Abstract

The problem of finding large cliques in random graphs and its "planted" variant, where one
wants to recover a clique of size $\omega \gg \log(n)$ added to an Erdős-Rényi graph $G \sim G(n, 1/2)$, have
been intensely studied. Nevertheless, existing polynomial time algorithms can only recover
planted cliques of size $\omega = \Omega(\sqrt{n})$. By contrast, information theoretically, one can recover
planted cliques so long as $\omega \gg \log(n)$.

In this work, we continue the investigation of algorithms from the sum of squares hierarchy 
for solving the planted clique problem begun by Meka, Potechin, and Wigderson [MPW15] and 
Deshpande and Montanari [DM15]. Our main results improve upon both these previous works 
by showing: 

1. Degree four SoS does not recover the planted clique unless $\omega \gg \sqrt{n}/\mathrm{polylog}\, n$, improving
upon the bound $\omega \gg n^{1/3}$ due to [DM15]. A similar result was obtained independently by
Raghavendra and Schramm [RS15].

2. For $2 < d = o(\sqrt{\log(n)})$, degree $2d$ SoS does not recover the planted clique unless
$\omega \gg n^{1/(d+1)}/(2^d\, \mathrm{polylog}\, n)$, improving upon the bound due to [MPW15].

Our proof for the second result is based on a fine spectral analysis of the certificate used in the 
prior works [MPW15, DM15, FK03] by decomposing it along an appropriately chosen basis. 
Along the way, we develop combinatorial tools to analyze the spectrum of random matrices with 
dependent entries and to understand the symmetries in the eigenspaces of the set symmetric 
matrices inspired by work of Grigoriev [Gri01a].

An argument of Kelner shows that the first result cannot be proved using the same certificate. 
Rather, our proof involves constructing and analyzing a new certificate that yields the nearly 
tight lower bound by "correcting" the certificate of [MPW15, DM15, FK03].

1 Introduction


Let $G(n, p)$ be the Erdős-Rényi random graph where each edge is present in $G$ with probability $p$
independently of others. It is an easy calculation that the largest clique in $G \sim G(n, 1/2)$ is of size
$(2 + o(1)) \cdot \log(n)$ with high probability. Recovering such a clique using an efficient algorithm has
been a long-standing open question in theoretical computer science. As early as 1976, Karp [Kar76]
suggested the impossibility of finding cliques of size even $(1 + \epsilon) \log(n)$ for any constant $\epsilon > 0$ in
polynomial time. Karp's conjecture was remarkably prescient and has stood its ground after nearly four
decades of research.
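The "easy calculation" mentioned above is a first-moment bound. The following sketch (ours, not part of the original argument; the function names are hypothetical and logarithms are assumed to be base 2) computes the largest $k$ for which the expected number of $k$-cliques in $G(n, 1/2)$, namely $\binom{n}{k} 2^{-\binom{k}{2}}$, is still at least 1, and prints it next to $2\log_2 n$. It only illustrates the upper bound on the clique number; the matching lower bound needs a second-moment argument that is not shown.

```python
import math

def log2_expected_cliques(n: int, k: int) -> float:
    """log2 of the expected number of k-cliques in G(n, 1/2),
    i.e. log2( C(n, k) * 2^(-C(k, 2)) ), computed in log space to avoid overflow."""
    return math.log2(math.comb(n, k)) - math.comb(k, 2)

def first_moment_threshold(n: int) -> int:
    """Largest k whose expected k-clique count is still at least 1."""
    k = 1
    while log2_expected_cliques(n, k + 1) >= 0:
        k += 1
    return k

if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        # Columns: n, first-moment threshold, 2*log2(n) for comparison.
        print(n, first_moment_threshold(n), round(2 * math.log2(n), 1))
```

The threshold sits below $2\log_2 n$ by a lower-order term, which is what the $(2 + o(1))$ factor absorbs.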

Lack of algorithmic progress on the question motivated Jerrum [Jer92] and Kučera [Kuc95] to
consider a relaxed version known as the planted clique problem. In this setting, we are given a graph
$G$ obtained by planting a clique of size $\omega$ on a graph sampled according to $G(n, 1/2)$. Information
theoretically, the added clique is identifiable as long as $\omega \gg \log(n)$. The goal is to recover the added
clique via an efficient algorithm for as small an $\omega$ as possible. This variant is also connected to
the question of finding large communities in social networks and the problem of signal finding
in molecular biology [PS00]. Despite attracting a lot of attention, the best known polynomial
time algorithm can only find planted cliques when their size $\omega = \Omega(\sqrt{n})$ [AKS98, FK03]. The LS+
semi-definite programming hierarchy leads to the state of the art trade-off: planted cliques of size
$\omega \approx \sqrt{n/2^d}$ can be recovered in time $n^{O(d)}$ for any $d = O(\log n)$.

Recently, this difficulty of finding cliques of size $\omega \ll \sqrt{n}$ has led to an increasing confidence
in planted clique being a candidate for an average case hard problem and has inspired new
research directions in cryptography [ABW10], property testing [AAK+07], machine learning [BR13],
algorithmic game theory [DBL09, ABC11] and mathematical finance [DBL10].

In this paper we are interested in understanding the Sum of Squares (SoS, also known as Lasserre)
semi-definite programming (SDP) hierarchy [Las01, Par00] for the planted clique problem. This is a
family of algorithms, parameterized by a number $d$ called the degree, where the $d$-th algorithm takes
$n^{O(d)}$ time to execute. The sum of squares hierarchy can be viewed as a common generalization and
extension of both linear programming and spectral techniques, and as such has been remarkably 
successful in combinatorial optimization. In particular it captures the state of the art algorithms for
problems such as Sparsest Cut [ARV04], MAX CUT [GW95], and Unique Games/Small Set Expansion
[ABS10, BRS11, GS12]. Recently, [BBH+12] showed that a polynomial time algorithm from this
hierarchy solves all known integrality gap instances of the Unique Games problem, and similar 
results have been shown for the hard instances of MAX-CUT [DMN13] and Balanced Separator 
[OZ13]. Very recently, [LRS15] showed that the sum of squares algorithm is in fact optimal amongst 
all efficient algorithms based on semidefinite programming for a large class of problems that 
includes constraint satisfaction and the traveling salesman problem. Moreover, Barak, Kelner 
and Steurer [BKS14, BKS15] used the SoS hierarchy to give improved algorithms to average case 
problems such as the dictionary learning problem and the planted sparse vector problem that have 
at least some similarity to the planted clique problem. Thus several researchers have asked whether
the SoS hierarchy can yield improved algorithms for this problem as well. 

The first published work along these lines was of Meka, Potechin and Wigderson [MPW15]
who showed that for every $d \ge 2$, the degree $2d$ SoS cannot find planted cliques of size smaller
than $\approx n^{1/2d}$.¹ Deshpande and Montanari [DM15] independently proved a tighter lower bound of
$\approx n^{1/3}$ for the case of degree 4. In the main result of this paper, we extend the prior works and show
that the first non-trivial extension of the spectral algorithm, namely the SoS algorithm of degree 4,
cannot find cliques of size $\approx \sqrt{n}$, a bound optimal within $\mathrm{polylog}(n)$ factors. Our lower bound for
degree 4 is obtained by a careful "correction" to the certificate used by [MPW15] and [DM15] in
their lower bounds.

Theorem 1.1 (Main Theorem 1). The canonical degree 4 SoS relaxation of the planted clique problem
(2.1) has an integrality gap of at least $\tilde{\Omega}(\sqrt{n})$ with high probability.²

A similar result was obtained in an independent work by Raghavendra and Schramm [RS15]. 
In our second main result, we give a tight analysis of the certificate considered by [MPW15] and 
[DM15] and show that it yields a lower bound of $\approx n^{1/(d+1)}$.

Theorem 1.2 (Main Theorem 2). For every $d = o(\sqrt{\log(n)})$, the canonical degree $2d$ SoS relaxation for
the planted clique problem (2.1) has an integrality gap of at least $\tilde{\Omega}(n^{1/(d+1)})$.

The certificate of [MPW15, DM15] is sufficient to show an $\Omega(\sqrt{n/2^d})$ lower bound for the
degree $d$ LS+ hierarchy [FK03] (which is a weaker SDP that also runs in time $n^{O(d)}$). However, a
generalization of an argument of Kelner (see Section 10) shows that this is not the case for the SoS 
hierarchy, and our analysis of this certificate is tight. Hence our work shows that to get stronger 
lower bounds for higher degree SOS it is necessary and sufficient to utilize more complicated 
constructions of certificates than those used for weaker hierarchies. Whether this additional 
complexity results in an asymptotically better tradeoff between the running time and clique size 
remains a tantalizing open question. 

2 Technical Overview 

The SoS semidefinite programming hierarchy yields a convex programming relaxation for the
planted clique problem. That is, we derive from the input graph $G$ a convex program $\mathcal{P}_G$ such that
if the graph had a clique of size $\omega$ then $\mathcal{P}_G$ is feasible. To show that the program fails to solve the
planted clique problem with parameter $\omega$, we show that with high probability there is a solution
(known as a certificate) for the program $\mathcal{P}_G$ even when $G$ is a random graph from $G(n, 1/2)$ (which
in particular will not have a clique of size $\gg \log n$).

The solution to degree $d$ hierarchies can be thought of as a vector $X \in \mathbb{R}^{n^d}$. For linear programming
hierarchies this vector needs to satisfy various linear constraints, while for semi-definite programming
hierarchies it also needs to satisfy constraints of the form $M \succeq 0$ where $M$ is a matrix where each entry
is a linear function of $X$. In previous SoS lower bounds for problems such as random 3XOR/3SAT,
Knapsack, and random constraint-satisfaction problems [Gri01a, Sch08, BCK15], the certificate $X$
was obtained in a fairly natural way, and the bulk of the work was in the analysis. In fact, in all
those cases the certificate used in the SoS lower bounds was the same one that was used before for
obtaining lower bounds for weaker hierarchies [FK03]. The same holds for the previous works for
the planted clique problem, where the works of [MPW15, DM15] used a natural certificate which is
a close variant of the certificate used by Feige and Krauthgamer [FK03] for LS+ lower bounds, and
showed that it satisfies the stronger conditions of the SoS program.

1 We use $\approx$ to denote equality up to factors polylogarithmic in $n$ (the size of the graph) and with an arbitrary dependence
on the degree parameter $d$.

2 Throughout this paper, we use $\tilde{O}$ to hide polylogarithmic factors in $n$.

It is known that such an approach cannot work to obtain an $\omega \approx \sqrt{n}$ lower bound for the SoS
program of degree 4 and higher. That is, this natural certificate does not satisfy the conditions
of the SoS program. Hence to obtain our lower bound for degree 4 SoS we need to consider a
more complicated certificate, that can be thought of as making a global "correction" to the simple
certificate of [MPW15, DM15]. For higher degrees, we have not yet been able to analyze the
corresponding complex certificate, but we are able to give a tight analysis of the simple certificate,
showing that it certifies an $\omega \approx n^{1/(d+1)}$ lower bound on the degree $2d$ SoS relaxation. The key technical
difficulty in both our work and prior ones is analyzing the spectrum of random matrices that have
dependent entries. Deshpande and Montanari [DM15] achieved such an analysis by a tour-de-force
of case analysis specifically tailored for the degree 4 case. However, the complexity of this argument
makes it unwieldy to extend to either the case of the more complex certificate or the case of
analyzing the simple certificate at higher degrees. Thus, key to our analysis is a more principled
representation-theoretic approach, inspired by Grigoriev [Gri01a], to analyzing the spectrum of
these kinds of matrices. We hope this approach will be of use in further results for both the
planted clique and other problems.

We now give an informal overview of the SoS program for planted clique, the [MPW15, DM15] 
certificate, our correction to it, and our analysis. See Section 3 and [BS14] for details. 

2.1 The SoS program for MAX CLIQUE 

Let $G = ([n], E)$ be any graph with the vertex set $[n]$ and edge set $E$. The following polynomial
equations ensure that any assignment $x \in \mathbb{R}^n$ must be the characteristic vector of an $\omega$-sized clique
in $G$:

$$
\begin{aligned}
x_i^2 &= x_i && \text{for all } i \in [n] \\
x_i \cdot x_j &= 0 && \text{for all } \{i, j\} \notin E \\
\sum_{i=1}^{n} x_i &= \omega.
\end{aligned}
\tag{2.1}
$$
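As a small illustration of the system (2.1), the sketch below checks the three constraint families directly for a candidate point. The function name, the 0-indexed vertices, and the representation of $E$ as a set of sorted pairs are our assumptions for the example, not notation from the paper.

```python
from itertools import combinations

def satisfies_clique_program(x, edges, omega):
    """Check the constraints of (2.1) for a point x in R^n.

    x     : list of numbers, one per vertex (vertices are 0-indexed here)
    edges : set of pairs (i, j) with i < j
    omega : target clique size
    """
    n = len(x)
    # x_i^2 = x_i forces every coordinate to be 0 or 1.
    boolean = all(x[i] * x[i] == x[i] for i in range(n))
    # x_i * x_j = 0 for every non-edge: no two selected vertices are non-adjacent.
    non_edges_ok = all(
        x[i] * x[j] == 0
        for i, j in combinations(range(n), 2)
        if (i, j) not in edges
    )
    # sum_i x_i = omega fixes the number of selected vertices.
    return boolean and non_edges_ok and sum(x) == omega

if __name__ == "__main__":
    edges = {(0, 1), (0, 2), (1, 2), (2, 3)}  # a triangle with a pendant vertex
    print(satisfies_clique_program([1, 1, 1, 0], edges, 3))  # the triangle
    print(satisfies_clique_program([0, 1, 1, 1], edges, 3))  # (1, 3) is a non-edge
```

A point passes the check exactly when it is the 0/1 indicator of a clique with $\omega$ vertices, which is the claim made in the text.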

There is a related formulation (which we refer to as the "optimization version") where the constraint
$\sum_{i=1}^{n} x_i = \omega$ is not present and instead we have an objective function $\sum_{i=1}^{n} x_i$ to maximize. This
latter formulation is used by [DM15] in their work. It is also the program of focus in the work of
Raghavendra and Schramm [RS15] who, independently of us, show an almost optimal lower bound
for the planted clique problem for the case of degree 4 SoS relaxation. A point feasible for (2.1) is
easily seen to be feasible for the optimization version with value $\omega$ and hence using the variant (2.1)
only makes our results stronger. It is unclear, however, whether an explicit constraint of $\sum_{i=1}^{n} x_i = \omega$
adds more refutation power to the program.³

The degree $d$ SoS hierarchy optimizes over an object called a degree-$d$ pseudo-expectation or
pseudo-distribution. A degree-$d$ pseudo-expectation operator for (2.1) is a linear operator $\tilde{\mathbb{E}}$ that behaves
to some extent as the expectation operator for some distribution over $x \in \mathbb{R}^n$ that satisfies the
conditions (2.1). For example this operator will satisfy that $\tilde{\mathbb{E}} \sum_{i=1}^{n} x_i = \omega$, $\tilde{\mathbb{E}} x_i^2 = \tilde{\mathbb{E}} x_i$, etc. More
formally, $\tilde{\mathbb{E}}$ is a linear operator mapping every polynomial $P$ of degree at most $d$ into a number $\tilde{\mathbb{E}} P$
such that $\tilde{\mathbb{E}} 1 = 1$, $\tilde{\mathbb{E}} P^2 \ge 0$ for every $P$ of degree at most $d/2$, and $\tilde{\mathbb{E}} PQ = 0$ for every $Q$ of degree at most
$d - \deg P$ and $P$ such that the constraint $\{P = 0\}$ appears in (2.1). Note that since the dimension of
the set of $n$-variate polynomials of degree at most $d$ is at most $n^d$, the operator $\tilde{\mathbb{E}}$ can be described as
a vector in $\mathbb{R}^{n^d}$. Moreover, the constraints on $\tilde{\mathbb{E}}$ can be captured by a semidefinite program, and this
semidefinite program is in fact the SoS program. See Section 3, the survey [BS14] or the lecture
notes [Bar14] for more on the SoS hierarchy.

2.2 The "Simple Moments" 

To show a lower bound of $\omega$, we need to show that for a random graph $G$, we can find a degree
$d$ pseudo-expectation operator that satisfies (2.1). Both the papers [MPW15] and [DM15] utilize
essentially the same operator, which we call here the "simple moments". It is arguably the most
straightforward way to satisfy the conditions of (2.1), and the bulk of the work is then in showing
the positivity constraint that $\tilde{\mathbb{E}} P^2 \ge 0$ for every $P$ of degree $\le 2$ (in the degree 4 case). [DM15] shows
that this will hold as long as $\omega \ll n^{1/3}$ and an argument of Kelner (see Section 2.3 below) shows that
this is tight and in fact these simple moments fail to satisfy the positivity conditions for $\omega \gg n^{1/3}$.

To define a degree $d$ pseudo-expectation operator $\tilde{\mathbb{E}}$, we need to choose some basis $\{P_1, \ldots, P_N\}$
for the set of polynomials of degree at most $d$ and define $\tilde{\mathbb{E}} P_i$ for every $i$. The simplest basis is simply
the monomial basis. Moreover, since our pseudo-expectation satisfies the constraints $\{x_i^2 = x_i\}$, we
can restrict attention to multilinear monomials, of the form $x_S = \prod_{i \in S} x_i$ for some $S \subseteq [n]$. Note also
that the constraints $x_i x_j = 0$ for $\{i, j\} \notin E$ imply that we must define $\tilde{\mathbb{E}} x_S = 0$ for every $S$ that is not a
clique in $G$. Indeed, the pseudo-distribution $\{x\}$ is supposed to mimic an actual distribution over
the characteristic vectors of $\omega$-sized cliques in $G$, and note that in any such distribution it would
hold that $\mathbb{E} x_S = 0$ when $S$ is not a clique.

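The simplest form of such a pseudo-distribution is to set

$$
\tilde{\mathbb{E}}\, x_S =
\begin{cases}
0 & S \text{ is not a clique} \\
\alpha_{|S|} & \text{otherwise}
\end{cases}
$$

where $\alpha_{|S|}$ is a constant depending only on the size of $S$. We can compute the value $\alpha_{|S|}$ by noting
that we need to satisfy $\tilde{\mathbb{E}} \left(\sum_i x_i\right)^\ell = \sum_{i_1, \ldots, i_\ell} \tilde{\mathbb{E}}\, x_{i_1} \cdots x_{i_\ell} = \omega^\ell$ for every $\ell = 1, \ldots, d$. Since there would
be about $\binom{n}{\ell} 2^{-\binom{\ell}{2}}$ $\ell$-sized cliques in the graph $G$, the value $\alpha_\ell$ will be $\approx (\omega/n)^\ell$.⁴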
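The scaling of $\alpha_\ell$ can be checked numerically. The sketch below is a heuristic of ours, not the exact moments: it keeps only the terms of $\left(\sum_i x_i\right)^\ell$ with $\ell$ distinct indices, uses the expected clique count $\binom{n}{\ell} 2^{-\binom{\ell}{2}}$ in place of the true count, and ignores the small modifications discussed in footnote 4. Under those assumptions $\alpha_\ell$ is $(\omega/n)^\ell$ times the factor $2^{\binom{\ell}{2}}$, which is a constant for fixed $\ell$. The function names are hypothetical.

```python
import math

def heuristic_alpha(n: int, omega: float, ell: int) -> float:
    """Spread the mass omega^ell evenly over the expected number of ordered
    ell-cliques in G(n, 1/2), i.e. ell! * C(n, ell) * 2^(-C(ell, 2))."""
    ordered_cliques = math.factorial(ell) * math.comb(n, ell) * 2.0 ** (-math.comb(ell, 2))
    return omega ** ell / ordered_cliques

def closed_form_alpha(n: int, omega: float, ell: int) -> float:
    """The approximation (omega / n)^ell with the degree-dependent constant made explicit."""
    return (omega / n) ** ell * 2.0 ** math.comb(ell, 2)

if __name__ == "__main__":
    n, omega = 10**4, 50.0
    for ell in (1, 2, 3, 4):
        # The two values agree up to a 1 + O(ell^2 / n) factor.
        print(ell, heuristic_alpha(n, omega, ell), closed_form_alpha(n, omega, ell))
```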

3 The reason, as we describe when discussing pseudo-expectations, is that adding $\{p = 0\}$ as a constraint ensures that
$\tilde{\mathbb{E}}[qp] = 0$ for every $q$ with $\deg(q) \le d - \deg(p)$ in addition to $\tilde{\mathbb{E}}[p] = 0$.

4 One actually needs to make some minor modifications to these moments to ensure they satisfy exactly the constraint
$\sum_i x_i = \omega$, as is done in [MPW15] and in our technical section. However, these corrections have very small magnitudes and
so all the observations below apply equally well to the modified moments, and so we ignore this issue in this informal
overview.

This pseudo-distribution is essentially the same one used by Feige and Krauthgamer [FK03] for
LS+, where the moments were shown to be valid for the constraints of this problem as long as $\omega < \sqrt{n/2^{d+1}}$.
Initially Meka and Wigderson conjectured that a similar bound holds for the SoS program, or in
other words, that the $\binom{n}{\le d/2} \times \binom{n}{\le d/2}$ matrix $M$ where $M_{S,T} = \tilde{\mathbb{E}}\, x_S x_T$ for every $S, T \subseteq [n]$ of size $\le d/2$
is positive semidefinite as long as $\omega \ll \sqrt{n}$. The Meka-Wigderson conjecture would have held if
the off-diagonal part of $M$, which is a random matrix with dependent entries, had a spectral
norm comparable with an independent random matrix with entries of a similar magnitude. However,
this turns out to fail in quite a strong way. An argument due to Jonathan Kelner, described in
[Bar14], shows that (for $d = 4$) the matrix $M$ is not positive semidefinite as long as $\omega \gg n^{1/3}$. We
review this argument below, as it is instructive for our correction.

2.3 There is such a thing as too simple

In the simple moments, every 4-clique $S$ gets the same pseudo-expectation. In some sense these
moments turn out to be "too random" in that they fail to account for some structure that the
graph possesses. Specifically, for $i \in [n]$, consider the linear function $r_i(x) = \sum_j r_{i,j} x_j$ where $r_{i,j}$
equals $+1$ when $\{i, j\} \in E$, equals $-1$ when $\{i, j\} \notin E$, and equals 0 when $i = j$. Now, consider the
polynomial $P(x) = \sum_{i=1}^{n} r_i(x)^4$. For every $x$ that is the characteristic vector of an $\omega$-clique in $G$,
$P(x) \ge \omega(\omega - 1)^4 \ge \omega^5/2$; indeed for every $i$ in the clique, $r_i(x)^4$ would equal $(\omega - 1)^4$. On the other
hand, for every $i$, let us consider the expectation of $\tilde{\mathbb{E}}\, r_i(x)^4$ taken over the choice of the random graph
$G$. Note that in a random graph the $r_{i,j}$'s are i.i.d. $\pm 1$ random variables, and hence

$$
\mathbb{E}\, \tilde{\mathbb{E}}\, r_i(x)^4 = \mathbb{E}\, \tilde{\mathbb{E}} \Big( \sum_j r_{i,j} x_j \Big)^4 = \sum_{j_1, j_2, j_3, j_4} \mathbb{E}\, r_{i,j_1} r_{i,j_2} r_{i,j_3} r_{i,j_4}\, \tilde{\mathbb{E}}\, x_{j_1} x_{j_2} x_{j_3} x_{j_4}. \tag{2.2}
$$

Let us group the terms on the RHS of (2.2) based on the number of distinct $j_\ell$'s. There are
$O(n^2)$ terms corresponding to two distinct $j_\ell$'s, each of them is multiplied by $\alpha_2 \approx \omega^2/n^2$, and so they
contribute a total of $C\omega^2$ to the expectation for some constant $C$. In the terms corresponding to three
or four $j_\ell$'s, there is always one variable $r_{i,j}$ that is not squared, and hence their contribution to the
expectation is zero. There are $O(n)$ terms corresponding to a single $j_\ell$, each multiplied by $\omega/n$, and so
their total contribution is at most $\omega$. We get that in expectation $\tilde{\mathbb{E}}\, r_i(x)^4 \le C\omega^2$ for some constant $C$
and by Markov this holds with high probability as well (for some different choice of the constant).

The conclusion is that while for every $\omega$-sized clique $x$, $P(x) \ge \omega^5/2$, the simple moments satisfy
that $\tilde{\mathbb{E}} P(x) \le C n \omega^2$. When $\omega \gg n^{1/3}$ this yields a strong discrepancy between the value the simple
moments give $P$ and the value that they should have given, had they corresponded to an actual
distribution on $\omega$-sized cliques. This discrepancy can be massaged into a degree 2 polynomial $Q$
such that $\tilde{\mathbb{E}} Q^2 < 0$ for the simple moments when $\omega \gg n^{1/3}$, thus showing that in this case these
moments do not satisfy the SoS program.

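The integral side of the discrepancy in Section 2.3 is easy to check on a concrete instance. The sketch below is a toy experiment of ours (the helper names, the planting on the first $\omega$ vertices, and the small sizes are assumptions for illustration). It builds the $\pm 1$ matrix $r_{i,j}$ of a planted instance and evaluates $P(x) = \sum_i r_i(x)^4$ at the characteristic vector of the planted clique. It does not compute any pseudo-expectation.

```python
import random

def planted_instance(n, omega, seed=0):
    """Return the +1/-1 adjacency matrix r (zero diagonal) of G(n, 1/2)
    with a clique planted on vertices 0, ..., omega - 1."""
    rng = random.Random(seed)
    clique = set(range(omega))
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            edge = (i in clique and j in clique) or rng.random() < 0.5
            r[i][j] = r[j][i] = 1 if edge else -1
    return r, clique

def kelner_polynomial(r, x):
    """P(x) = sum_i r_i(x)^4 with r_i(x) = sum_j r[i][j] * x[j]."""
    n = len(r)
    return sum(sum(r[i][j] * x[j] for j in range(n)) ** 4 for i in range(n))

if __name__ == "__main__":
    n, omega = 60, 10
    r, clique = planted_instance(n, omega)
    x = [1 if v in clique else 0 for v in range(n)]
    # Each clique vertex alone contributes (omega - 1)^4, so P(x) >= omega * (omega - 1)^4.
    print(kelner_polynomial(r, x), omega * (omega - 1) ** 4)
```

The lower bound holds for every graph, because the terms for vertices outside the clique are fourth powers and hence non-negative.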
2.4 Fixing the simple moments 

Our fix for the simple moments is directly motivated by the example above. We want to ensure that
the polynomial $P$ will get a pseudo-expectation of $\approx \omega^5$, and that in fact for every $i$, $\tilde{\mathbb{E}}\, r_i(x)^4$ will be
roughly $\omega^5/n$. The idea is to break the symmetry between different equal-sized cliques and give a
significantly higher pseudo-expectation to cliques that are somewhat over-represented by these
polynomials. Specifically for every set $S$, define $r_S = \sum_i \prod_{j \in S} r_{i,j}$. Note that $r_S$ is a sum of $n$ entries
in $\{\pm 1\}$, and in a random graph it behaves roughly like a normal variable with mean 0 and variance
$n$. Roughly speaking, the corrected moments will set

$$
\tilde{\mathbb{E}}[x_S] = \alpha_{|S|} (1 + r_S\, \omega / n)
$$

for every clique $S$. Note that when $\omega = \epsilon \sqrt{n}$, the correction factor would typically be of the form
$1 \pm \Theta(\epsilon)$.⁵

Computing the pseudo-expectation $\tilde{\mathbb{E}}\, r_i(x)^4$ under the new moments we again get the expression

$$
\sum_{j_1, j_2, j_3, j_4} r_{i,j_1} r_{i,j_2} r_{i,j_3} r_{i,j_4}\, \tilde{\mathbb{E}}\, x_{j_1} x_{j_2} x_{j_3} x_{j_4}.
$$

If we now focus on the contribution of the $n^4$ terms where the $j_\ell$'s are all distinct, we see that each
such set $S$ yields the term

$$
\prod_{j \in S} r_{i,j}^2\, \alpha_4\, \omega / n.
$$

Since $\alpha_4 = C\omega^4/n^4$ we get that $\tilde{\mathbb{E}}\, r_i(x)^4 = C\omega^5/n$ as desired.
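To make the rough definition of the corrected moments concrete, the sketch below evaluates $r_S$ and $\alpha_{|S|}(1 + r_S\, \omega/n)$ on a tiny hand-built $\pm 1$ matrix. It follows only the informal formula of this overview; the formal definition in Section 5 includes further adjustments that are not modelled here, and the function names and the 4-vertex example are ours.

```python
from itertools import combinations
from math import prod

def r_stat(r, S):
    """r_S = sum over vertices i of prod_{j in S} r[i][j].
    Vertices i inside S contribute 0 because r[i][i] = 0."""
    return sum(prod(r[i][j] for j in S) for i in range(len(r)))

def is_clique(r, S):
    """S is a clique when every pair inside it is an edge (entry +1)."""
    return all(r[i][j] == 1 for i, j in combinations(sorted(S), 2))

def corrected_moment(r, S, alpha, omega):
    """Informal corrected moment: alpha * (1 + r_S * omega / n) on cliques, 0 otherwise."""
    if not is_clique(r, S):
        return 0.0
    return alpha * (1 + r_stat(r, S) * omega / len(r))

if __name__ == "__main__":
    # Symmetric +1/-1 matrix with zero diagonal on 4 vertices (toy data).
    r = [
        [0, 1, 1, -1],
        [1, 0, 1, 1],
        [1, 1, 0, -1],
        [-1, 1, -1, 0],
    ]
    print(r_stat(r, {0, 2}), corrected_moment(r, {0, 2}, alpha=0.25, omega=2))
    print(corrected_moment(r, {0, 3}, alpha=0.25, omega=2))  # {0, 3} is not an edge
```

Cliques whose vertices have correlated neighbourhoods (large positive $r_S$) receive a larger pseudo-expectation, which is the symmetry breaking described above.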

2.5 Analyzing the corrected moments 

The above gives some intuition why the corrected moments might be better than the simple 
moments for one set of polynomials. But a priori it is not at all clear that those polynomials 
encapsulated all the issues with the simple moments. Moreover, it is also unclear whether or not the 
correction itself could introduce additional issues, and create new types of negative eigenvectors. 
Ruling out these two possibilities is the crux of our analysis. 

Here we discuss some key points from our analysis of $\tilde{\mathbb{E}}$. Since [DM15] carried out a thorough
analysis of the degree-4 simple moments, we begin by reviewing their approach.

Approach of [DM15]. The PSDness of $\tilde{\mathbb{E}}$ reduces to proving PSDness of a related matrix $M \in
\mathbb{R}^{\binom{[n]}{2} \times \binom{[n]}{2}}$, $M(S, T) = \tilde{\mathbb{E}}\, x_S x_T$. The eigendecomposition of $E = \mathbb{E}_G[M]$ has three eigenspaces $V_0, V_1, V_2$
and eigenvalues $\lambda_0 \approx \omega^4/n^2$ on $V_0$, $\lambda_1 \approx \omega^3/n^2$ on $V_1$, and $\lambda_2 \approx \omega^2/n^2$ on $V_2$. Next, write $M$ as a
block matrix with blocks $M_{ij} = \Pi_{V_i} M \Pi_{V_j}$, where $\Pi_V$ projects to the subspace $V$. On the diagonal
blocks, $E$ contributes large positive eigenvalues. If the on-diagonal blocks satisfy $M_{ii} \succeq \lambda_i I$ for some $\lambda_i$'s so
that $\|M_{ij}\| \ll \sqrt{\lambda_i \lambda_j}$, then $M$ will be PSD.

5 While it might seem that there is a chance for these pseudo-expectations to be negative, if $\omega < \sqrt{n}/\mathrm{polylog}(n)$ then it
is exceedingly unlikely that there will exist an $S$ such that $|r_S| > n/\omega$, and so we ignore this issue in this overview.

Figure 1: The block matrix / subspace decomposition view, before and after the correction. Uninteresting entries left
empty.

Because of the dependencies in the random matrix $M$, the deviation from expectation varies
according to the eigenspace. Thus, in [DM15], the deviation from the expectation is analyzed by
first decomposing along the eigenspaces $V_0, V_1, V_2$.

A second technical idea is required to carry out this decomposition. Because of the symmetries
present in the spaces $V_0, V_1, V_2$, this decomposition is very nearly the same as splitting up the matrix
$M$ in an ostensibly unrelated way. Each entry $M(I, J)$ of $M$ for $I, J \in \binom{[n]}{2}$ is the 0/1 indicator for the
presence of a clique on $I \cup J$. This indicator is just the AND of all the $\pm 1$ indicators $g_e$ for the presence
of the edges $e \in E(I \cup J)$. Taking a Fourier decomposition of this suggests a way to decompose $M$
as $M = \sum_{\text{subsets } S \text{ of edges on 4 nodes}} M_S$,⁶ where the matrix $M_S$ corresponds to the Fourier character $S$.
The matrices $M_S$ can be matched up to the subspaces $V_0, V_1, V_2$ in such a way that those matrices
with larger spectral norm (corresponding to larger deviations from expectation) have subspaces with
smaller eigenvalues in their kernels!

Pinpointing the failure of the simple moments. The foregoing is missing one subtlety. Some
monomial matrices $M_S$ do not match nicely to a single subspace. Instead they form cross terms:
for example, having $V_2$ in the left kernel but not in the right kernel (not all these matrices need be
symmetric). In fact, it is just such a matrix which keeps the simple moments from remaining PSD
beyond $\omega \approx n^{1/3}$.

For the four nodes $a_1, a_2, b_1, b_2$, consider the monomial $g_{a_2,b_1} g_{a_2,b_2}$ and the corresponding matrix
$M_S(\{a_1, a_2\}, \{b_1, b_2\}) \approx (\omega^4/n^4)\, g_{a_2,b_1} g_{a_2,b_2}$. The entries of $M_S$ do not depend on $a_1$, and so there are many
repeated rows, which creates a much larger spectral norm for $M_S$ than if it had independent entries.
[DM15] prove the (tight) bound $\|M_S\| \approx \omega^4/n^{5/2}$. At the same time, it turns out only to have $V_2$ in
its left kernel, not its right one. Appealing to the above picture, in order to have $\omega^4/n^{5/2} \ll \sqrt{\lambda_1 \lambda_2} \approx \omega^{5/2}/n^2$,
we must have $\omega \ll n^{1/3}$.

Analysis of the correction. We make one further observation about the matrix $M_S$ from the
previous section: its rows are the tensor squares of the $\pm 1$ neighborhood indicator vectors $r_i$
from above. Our fix to the simple moments, described above as adjusting individual
pseudo-expectations, amounts roughly to adding to $M$ the matrix $(\omega^5/n^5) \sum_i (r_i^{\otimes 2})(r_i^{\otimes 2})^\top$. This carves out
of $V_2$ (our worst subspace from an eigenvalue perspective) a new subspace $V_{1.5}$ with
eigenvalues lower-bounded by $\lambda_{1.5} \approx \omega^5/n^3 \gg \omega^2/n^2$. Now instead of matching the bad matrix $M_S$
to $V_1$ and $V_2$ as a cross term, we can match it to $V_1$ and $V_{1.5}$ as a cross term. Then we only need
$\omega^4/n^{5/2} \ll \sqrt{\lambda_1 \lambda_{1.5}} \approx \omega^4/n^{5/2}$. With some care in the details, the above picture can be made precise.

6 This is not quite the whole picture, see Section 7.1.1.

However, a crucial point is that the matrix $N = (\omega^5/n^5) \sum_i (r_i^{\otimes 2})(r_i^{\otimes 2})^\top$ doesn't satisfy the clique
constraints (in that all entries $I, J$ with $I \cup J$ not a clique should be 0). A chunk of our proof goes
into analyzing the discrepancy between the matrix $N$ and its zeroed out version. Our analysis here
requires the use of new combinatorial tools (Section 6.4) combined with the trace moment method.

Symmetries of Eigenspaces and Tight Analysis of MPW Operator for Higher Degrees. As we
have alluded to already, a key technical step in our proof is to show that certain Fourier-decomposed
matrices of the form discussed above have some of the subspaces $V_0, V_1, V_2$ in their kernels. In
the analysis of simple moments for degree 4, [DM15] use explicit entries for canonical forms of
eigenvectors in $V_0, V_1$ to accomplish this. However, this approach hits analytical roadblocks for
the analysis in the case of higher degrees. Canonical forms for the eigenvectors are hard to pin down
explicitly from the literature in algebraic combinatorics.

To mitigate this difficulty, we take a more principled approach to understand the eigenspaces
$V_0, V_1, \ldots, V_d \subseteq \mathbb{R}^{\binom{[n]}{d}}$ in terms of their symmetries. Using techniques from basic representation
theory of finite groups, we arrive at an explicit family of symmetries that express any vector in
$V_i \subseteq \mathbb{R}^{\binom{[n]}{d}}$ as an explicit linear transformation of some vector in $\mathbb{R}^{\binom{[n]}{i}}$. It also shows that for any
$v \in V_i \subseteq \mathbb{R}^{\binom{[n]}{d}}$, the form $\langle v, x^{\otimes d} \rangle$ is essentially the multilinearization of $\left(\sum_j x_j\right)^{d-i} p(x)$ for some
$p$.

A similar approach was utilized heavily by Grigoriev [Gri01a] to prove a sum of squares lower
bound for the knapsack problem. While for degree 4 either explicit eigenvectors or our approach
will work, although the latter takes some more elbow grease, ours is absolutely vital for our tight
analysis of the MPW moments for the higher degrees. We hope that such an approach will be useful
for proving better (approaching $\omega \approx \sqrt{n}$) integrality gaps for Sum of Squares relaxations of higher
degree for Planted Clique and other related problems.

The analysis of the MPW operator at higher degrees also presents other new challenges that do
not show up in the special case of degree 4 analyzed in [DM15]. [DM15] deal with the optimization
version of the degree 4 SoS program which could be potentially weaker than the one we analyze
here (and thus our lower bound is technically stronger). Working with the "optimization" version
simplifies the analysis in [DM15] a little bit as the matrix $M$ has entries that only have local
dependence on the graph $G$. We explicitly work with the feasibility version of the degree 4 SoS
program and thus must deal with the additional complexity of the entries of $M$ having a global
dependence. As in [MPW15], we deal with this situation by separating $M$ into matrices $L$ and $\Delta$
such that $L$ has only local dependence on the graph $G$. [MPW15] deal with $\Delta$ by a simple entrywise
bound; however, employing such a bound yields no improvement over the bound proved in
[MPW15] for us. It turns out that we have to do a fine-grained analysis of the $\Delta$ matrix itself by a
decomposition for $\Delta$ such that each piece is essentially only locally dependent on the graph. Once
we have such a decomposition, our ideas from the analysis of $L$ can be extended to that setting as
well.

Finally, our argument for analyzing the spectral norms of each of the pieces encountered in
the decompositions also needs to be much more general than in case of [DM15] to handle higher 
degrees. For this, we identify a simple combinatorial structure that controls the norm bounds and 
allows a general hammer for computing the norms of all the matrices that appear in this analysis. 
Our proofs here are based on the trace power method and build on the combinatorial techniques in 
[DM15]. 


2.6 Preview of Technical Toolkit 


In this section we give a preview of the key lemmas that allow us to carry out the analyses described 
thus far. We have simplified some issues for the sake of exposition; details may be found in 
Section 6. 

We are concerned with the matrices in the aforementioned Fourier decomposition. Let $B \subseteq [d] \times [d]$
be a bipartite graph on $2d$ vertices. Let $Q_B$ be an $\binom{n}{d} \times \binom{n}{d}$ matrix with entries

$$
Q_B(I, J) = (-1)^{\text{number of } B\text{-edges which are not } G\text{-edges when the left vertex set of } B \text{ is replaced by } I \text{ and the right one with } J}.
$$

(We are ignoring what happens if $I \cap J \ne \emptyset$.)
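A single entry of $Q_B$ can be computed directly from this definition. The sketch below is ours and fixes one convention that the informal statement leaves open: the $s$-th left vertex of $B$ is identified with the $s$-th smallest element of $I$, and likewise on the right with $J$ (0-indexed). The function name and the edge-set representation of $G$ are also assumptions.

```python
def q_entry(B, I, J, edges):
    """Entry Q_B(I, J) = (-1)^(number of B-edges that are not G-edges).

    B     : iterable of pairs (s, t) with 0 <= s, t < d
    I, J  : disjoint collections of d vertices of G
    edges : set of frozensets {u, v}, the edge set of G
    """
    left, right = sorted(I), sorted(J)
    # Count the edges selected by B that are missing from G.
    missing = sum(1 for s, t in B if frozenset((left[s], right[t])) not in edges)
    return -1 if missing % 2 else 1

if __name__ == "__main__":
    edges = {frozenset((0, 2)), frozenset((1, 2))}
    B = [(0, 0), (1, 1)]           # a 2-matching between left and right
    print(q_entry(B, (0, 1), (2, 3), edges))  # pair (0, 2) present, (1, 3) missing
```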



[Figure 2 image: an example bipartite graph $B$ and the entry $Q_B(\{a_1, a_2, a_3, a_4\}, \{b_1, b_2, b_3, b_4\})$; legend: "subgraph of G", "edge selected by B, present in G".]

Figure 2: Example $B$ and $Q_B$, where the entry is determined by the parity of the number of selected edges, $d = 4$. Lemma 2.2 says that $\Pi_4 Q_B = 0$ and $Q_B \Pi_4 = Q_B \Pi_3 = 0$.
Lemma 2.1 says that $\|Q_B\| \approx n^3$ with high probability when $G \sim G(n, 1/2)$, since $B$ contains a 2-matching.

This first lemma bounds the spectral norm of such a matrix in terms of the shape of $B$.

Lemma 2.1 (Informal version of Lemma 6.11⁷). Let $c$ be the number of edges in the maximum matching
in $B$. With high probability, $\|Q_B\| = \tilde{O}(n^{d - c/2})$.
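The exponent in Lemma 2.1 depends on $B$ only through the size $c$ of a maximum matching. As an illustration, the sketch below computes $c$ with the standard augmenting-path method for bipartite matching and returns the exponent $d - c/2$. The function names and the example $B$ are hypothetical; the example is chosen to have a 2-matching with $d = 4$, and is not claimed to be the graph drawn in Figure 2.

```python
def max_matching_size(B, d):
    """Size of a maximum matching in the bipartite graph B, a collection of
    pairs (s, t) with 0 <= s, t < d, found by repeated augmenting paths."""
    adj = {s: sorted(t for (u, t) in B if u == s) for s in range(d)}
    match_right = {}  # right vertex -> left vertex currently matched to it

    def try_augment(s, seen):
        for t in adj[s]:
            if t in seen:
                continue
            seen.add(t)
            # Use t if it is free, or if its current partner can be re-matched elsewhere.
            if t not in match_right or try_augment(match_right[t], seen):
                match_right[t] = s
                return True
        return False

    return sum(1 for s in range(d) if try_augment(s, set()))

def norm_exponent(B, d):
    """Exponent e such that Lemma 2.1 gives a bound of the form n^e (up to polylog factors)."""
    return d - max_matching_size(B, d) / 2

if __name__ == "__main__":
    # Three non-isolated left vertices, two non-isolated right vertices.
    B = {(0, 0), (0, 1), (1, 0), (2, 0)}
    print(max_matching_size(B, 4), norm_exponent(B, 4))
```

For this $B$ the maximum matching has two edges, so the exponent is $4 - 2/2 = 3$, in line with the $n^3$ bound quoted for the $d = 4$ example with a 2-matching.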

We also want to show that these matrices have nontrivial kernels, so we can bound their 
negative eigenvalues against the parts of the simple moments with larger positive eigenvalues. The 
following allows us to carry out this matching of Fourier decomposition matrices to eigenspaces 
$V_0, \ldots, V_d$ of the expectation matrix.

Lemma 2.2 (Informal version of Lemma 6.9). Let $B_\ell$ ($B_r$) be the subset of vertices on the left (right,
respectively) hand side with nonzero degrees in $B$. Let $\Pi_i$ be the projector to $V_i$. Then,

1. For every $j > |B_\ell|$,

$\Pi_j Q_B = 0$,

2. For every $i > |B_r|$,

$Q_B \Pi_i = 0$.

The maximum matching cannot be too small when $|B_\ell| + |B_r|$ is large, which allows us to combine
these lemmas: for every $B$, either $Q_B$ has small spectral norm or its kernel contains the spaces where
the diagonal of the expectation matrix is small.

2.7 Related Work 

There is a large body of work on understanding Linear and Semidefinite Programming based
hierarchies. A detailed survey on the sum of squares hierarchy and references to related works
can be found in [BS14]. The earliest works on proving SoS lower bounds were due to Grigoriev
[Gri01a, Gri01b] who showed that degree $\Omega(n)$ SoS does not beat the random assignment for
3SAT or 3XOR even on random instances from a natural distribution. Some of these lower
bounds were rediscovered by Schoenebeck [Sch08]. Lower bounds for SoS essentially rely on
gadget reductions from 3SAT or 3XOR and this approach has been understood in some detail
[Tul09, BCV+12]. An exception to this methodology is the recent work of Barak et al. in proving
SoS lower bounds for pairwise independent CSPs [BCK15]. Even though the lower bounds for 
CSPs are for random instances, the average-case nature of the problem does not show up as a main 
analytic issue. There has recently been a surge of interest in understanding the performance of SoS 
on average-case problems of interest in machine learning, both in proving upper and lower bounds 
[HSS15, BM15, MW15, GM15, BKS15]. 

For the planted clique problem, Feige and Krauthgamer gave an analysis of the performance of
the LS+ semidefinite hierarchy tight to within constants [FK00, FK03], giving the state of the art
algorithm for finding planted cliques in any fixed polynomial time. Other algorithmic techniques
not based on convex relaxations have been studied and shown to fail for planted clique beyond
$\omega \approx \sqrt{n}$, most prominently Markov Chain Monte Carlo (MCMC) [Jer92]. Recently, Feldman et
al. [FGR+13] showed a lower bound for (a variant of) the planted clique problem in the restricted
class of statistical algorithms that generalize MCMC based methods and many other algorithmic

7 We use and prove only the cases c = 1, c = 2, but the general version follows from almost identical techniques. 
techniques. Frieze and Kannan [FK08] proposed an approach for the planted clique problem
through optimizing a degree-3 polynomial related to the random graph. Such polynomials are
NP-hard to optimize in the worst case but the belief is that the random nature of the polynomials might
be helpful. This approach was generalized to higher degree polynomials by Brubaker and Vempala 
[BV09]. 

There has also been recent work on variants of the problem that define Gaussian versions of the
planted clique and, more generally, the hidden submatrix problems, showing, for example, strong
indistinguishability results about the spectrum of the associated matrices with and without planting
[MRZ14]. Finally, the present work builds heavily on independent papers of Meka, Potechin, and
Wigderson [MPW15] and Deshpande and Montanari [DM15], which we have already thoroughly
discussed.

Overview of Rest of the Paper. Section 3 contains preliminaries. Section 4 contains definitions 
and the necessary background on the simple moments, a.k.a. the MPW operator. Section 5 contains 
the formal definition of our corrected degree 4 moments. Section 6 lays out the technical framework 
for the analyses of the corrected degree 4 moments and for the tightened bounds on the MPW 
operator at higher degrees. Here we define the Fourier decompositions alluded to above and carry
out representation-theoretic arguments about their kernels. Section 7 and Section 8 use the tools we 
have built thus far to prove the main theorems. In Section 9 we prove a technical concentration 
result for small subgraphs of G(n, 1/2) required for the analysis. In Section 10 we sketch Kelner's 
argument showing that our analysis of the MPW moments is nearly optimal. 

3 Preliminaries 

We will use the following general notation in the paper. 

1. $G$ will denote a draw from $G(n, 1/2)$ unless otherwise stated.

2. $\|x\|_2 = \|x\|$ denotes the Euclidean norm of a vector $x \in \mathbb{R}^m$.

3. For square symmetric matrices $Q, R$, we write $Q \succeq R$ to mean $Q - R$ is positive semidefinite.

4. For any matrix $M$, $\|M\|$ denotes its largest singular value, or, equivalently,
$\|M\| = \max_{x : \|x\|_2 = 1} \|Mx\|_2$.

5. For matrices $M, N$ of the same dimensions, $M \odot N$ denotes their entrywise or Hadamard product,
i.e., $(M \odot N)(I, J) = M(I, J) \cdot N(I, J)$ for every $I, J$.

6. For a graph $G$ and any set $e$ of two vertices of $G$, $g_e$ denotes the $\{-1, 1\}$ indicator of the edge $e$
being present in $G$. That is, $g_e = +1$ if $e$ is an edge in $G$ and $-1$ otherwise.

7. For a set $I$ of vertices of $G$, $\mathcal{E}(I) = \binom{I}{2}$, the set of all pairs from $I$.

8. For a pair of subsets of vertices $I, J$ of $G$, $\mathcal{E}_{\mathrm{ext}}(I, J) = \mathcal{E}(I \cup J) \setminus (\mathcal{E}(I) \cup \mathcal{E}(J))$, the set of cut edges
between $I$ and $J$.

9. For a subspace $V$, $\Pi_V$ denotes the projector to $V$.
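
As a concrete illustration of the edge notation in the list above (the edge indicator, the pair set of a vertex set, and the set of cut pairs), here is a minimal Python sketch. The function names and the toy path graph are hypothetical and chosen only for this example.

```python
from itertools import combinations

def pairs(vertices):
    """E(I): all unordered pairs of distinct vertices from I."""
    return {frozenset(p) for p in combinations(sorted(vertices), 2)}

def ext_pairs(I, J):
    """E_ext(I, J): pairs inside I ∪ J that lie neither inside I nor inside J."""
    return pairs(set(I) | set(J)) - (pairs(I) | pairs(J))

def edge_indicator(edges, e):
    """g_e in {-1, +1}: +1 if the pair e is an edge of G, -1 otherwise."""
    return 1 if frozenset(e) in edges else -1

# Hypothetical toy graph on vertices 1..4: the path 1-2-3-4.
G_EDGES = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4)]}
```

For disjoint sets the cut pairs are exactly the cross pairs; when the sets overlap, pairs touching the intersection are already inside one of the two sets and drop out.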

Following [BBH+12] and many subsequent papers, we work with SoS using the language of
pseudo-expectations.

Definition 3.1 (Pseudo Expectation). A linear operator $\tilde{\mathbb{E}} : \mathcal{P}^n_d \to \mathbb{R}$ is a degree $d$ pseudo-expectation
operator if it satisfies:

Normalization: $\tilde{\mathbb{E}}[1] = 1$, where on the LHS $1$ denotes the constant polynomial $p$ such that $p(x) = 1$.
Positivity (or positive semidefiniteness): $\tilde{\mathbb{E}}[p^2] \ge 0$ for every $p \in \mathcal{P}^n_{d/2}$.

For every polynomial $p \in \mathcal{P}^n_d$, we say that $\tilde{\mathbb{E}}$ satisfies the constraint $\{p = 0\}$ if $\tilde{\mathbb{E}}[pq] = 0$ for
every $q \in \mathcal{P}^n_{d - \deg(p)}$. The sum of squares hierarchy can be thought of as optimizing over pseudo
expectations (see [BS14] and the lecture notes [Bar14]).


Fact 3.2. Let $p_0, \ldots, p_k \in \mathcal{P}^n_d$. If a pseudo-expectation satisfying the constraints $\{p_0 = 0, \ldots, p_k = 0\}$ exists,
it can be found in time $n^{O(d)}$. If none exists, a certificate of infeasibility of these equations is found instead.

Fact 3.3 (Special Case of Gershgorin Circle Theorem). For any square matrix $M \in \mathbb{R}^{N \times N}$,

$$\|M\| \le \max_{i \in [N]} \left( \sum_{j=1}^{N} |M_{ij}| \right).$$

The following observations (actually both the same observation in different forms) will come in 
handy in our analysis. 


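Lemma 3.4. Let $M \in \mathbb{R}^{n \times n}$ be self-adjoint. Let $W_1, \ldots, W_k$ be an orthogonal decomposition of $\mathbb{R}^n$ into
subspaces. Let $P_i$ be the projector to $W_i$. Let $\lambda_1, \ldots, \lambda_k \ge 0$ and suppose for all $i, j$

$$P_i M P_i \succeq \lambda_i P_i \quad \text{and, when } i \ne j, \quad \|P_i M P_j\| \le \frac{1}{k} \sqrt{\lambda_i \lambda_j}.$$

Then $M$ is PSD.

Proof. Consider a unit vector $x \in \mathbb{R}^n$ and write it as $x = \sum_{i \in [k]} P_i x$. We expand

$$\langle x, Mx \rangle = \sum_{i,j \in [k]} \langle x, P_i M P_j x \rangle \ge \sum_{i} \lambda_i \|P_i x\|^2 - \frac{1}{k} \sum_{i \ne j} \sqrt{\lambda_i \lambda_j}\, \|P_i x\| \|P_j x\|$$

by our assumptions on $P_i M P_j$ and Cauchy-Schwarz. For each $i, j$, we know $\frac{1}{2}(\|P_i x\|^2 \lambda_i + \|P_j x\|^2 \lambda_j) \ge
\|P_i x\| \|P_j x\| \sqrt{\lambda_i \lambda_j}$, which implies that the whole expression is nonnegative. □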
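
The counting behind the last step of the proof of Lemma 3.4 can be spelled out as follows. This is a sketch under the assumption that the off-diagonal hypothesis reads $\|P_i M P_j\| \le \frac{1}{k}\sqrt{\lambda_i \lambda_j}$ (the extracted statement is incomplete at that point); the shorthand $a_i = \lambda_i \|P_i x\|^2$ is introduced only for this display.

```latex
\begin{align*}
\langle x, Mx\rangle
 &\ge \sum_{i} a_i - \frac{1}{k}\sum_{i \ne j} \sqrt{a_i a_j}
 && \text{(hypotheses and Cauchy--Schwarz)} \\
 &\ge \sum_{i} a_i - \frac{1}{2k}\sum_{i \ne j} (a_i + a_j)
 && \text{(AM--GM)} \\
 &= \sum_{i} a_i - \frac{k-1}{k}\sum_{i} a_i
 \;=\; \frac{1}{k}\sum_{i} a_i \;\ge\; 0 .
\end{align*}
```

The equality uses that each $a_i$ appears in exactly $2(k-1)$ of the ordered pairs $(i, j)$ with $i \ne j$.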

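Lemma 3.5. Let $M \in \mathbb{R}^{n \times n}$ be self-adjoint. Let $V_1, V_2$ be subspaces of $\mathbb{R}^n$. Let $\Pi_{V_i}$ be the projector to $V_i$. If
$\sqrt{\lambda_1 \lambda_2} \ge \|\Pi_{V_1} M \Pi_{V_2}\|$, then

$$\Pi_{V_1} M \Pi_{V_2} + \Pi_{V_2} M \Pi_{V_1} \preceq \lambda_1 \Pi_{V_1} + \lambda_2 \Pi_{V_2}.$$

Proof. For any $x \in \mathbb{R}^n$ we have

$$\begin{aligned}
2 \langle x, \Pi_{V_1} M \Pi_{V_2} x \rangle &\le 2 \|\Pi_{V_1} M \Pi_{V_2}\| \cdot \|\Pi_{V_1} x\| \cdot \|\Pi_{V_2} x\| \\
&\le 2 (\sqrt{\lambda_1} \|\Pi_{V_1} x\|)(\sqrt{\lambda_2} \|\Pi_{V_2} x\|) \\
&\le \lambda_1 \|\Pi_{V_1} x\|^2 + \lambda_2 \|\Pi_{V_2} x\|^2 \\
&= \lambda_1 \langle x, \Pi_{V_1} x \rangle + \lambda_2 \langle x, \Pi_{V_2} x \rangle. \qquad □
\end{aligned}$$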
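
The inequality of Lemma 3.5 can be sanity-checked numerically. The sketch below is only an illustration under simplifying assumptions that are not in the text: $V_1$ and $V_2$ are taken to be coordinate subspaces, and the operator norm of the off-diagonal block is replaced by its Frobenius norm, which upper bounds it. All names are hypothetical.

```python
import math
import random

def lemma_3_5_gap(M, a, lam1, lam2, x):
    """Return RHS - LHS of the quadratic-form inequality for one vector x.

    V1 is spanned by the first `a` coordinates and V2 by the remaining ones,
    so the projectors simply zero out coordinates.
    """
    n = len(M)
    x1 = [x[i] if i < a else 0.0 for i in range(n)]   # Pi_{V1} x
    x2 = [x[i] if i >= a else 0.0 for i in range(n)]  # Pi_{V2} x
    # <x, Pi1 M Pi2 x> = <Pi1 x, M Pi2 x>; M is symmetric, so both cross terms agree.
    cross = sum(x1[i] * M[i][j] * x2[j] for i in range(n) for j in range(n))
    lhs = 2.0 * cross
    rhs = lam1 * sum(t * t for t in x1) + lam2 * sum(t * t for t in x2)
    return rhs - lhs

def random_symmetric(n, rng):
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = rng.uniform(-1.0, 1.0)
    return M

def off_block_frobenius(M, a):
    """Frobenius norm of the V1-V2 block; an upper bound on ||Pi1 M Pi2||."""
    n = len(M)
    return math.sqrt(sum(M[i][j] ** 2 for i in range(a) for j in range(a, n)))

rng = random.Random(0)
M = random_symmetric(6, rng)
F = off_block_frobenius(M, 2)
# lam1 = 2F and lam2 = F/2 give sqrt(lam1 * lam2) = F >= ||Pi1 M Pi2||.
worst_gap = min(
    lemma_3_5_gap(M, 2, 2.0 * F, 0.5 * F, [rng.gauss(0, 1) for _ in range(6)])
    for _ in range(1000)
)
```

Sampling random vectors does not prove positive semidefiniteness; it only checks the quadratic form on the sampled directions.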
4 The MPW Operator


In this section, we describe the linear operator $\tilde{\mathbb{E}} : \mathcal{P}^n_{2d} \to \mathbb{R}$ for every $d \in O(\log n)$ used by [MPW15]
and [DM15]. It is this operator which we will show gives an integrality gap for degree-$2d$ SoS when
$\omega \approx n^{1/(d+1)}$, and it will also form the basis for our improved integrality gap witness at degree four.

The main task in such a setting is to show that $\tilde{\mathbb{E}}$ is positive semidefinite. $\tilde{\mathbb{E}}$ is the same as the
operator used by [MPW15], who showed that for $\omega \le \tilde{O}(n^{1/2d})$ and a graph $G$ drawn at random from
$G(n, 1/2)$, $\tilde{\mathbb{E}}$ is a degree $d$ pseudo-expectation that satisfies all the constraints with high probability
over the draw of $G$. In other words, they showed that $\tilde{\mathbb{E}}$ is a "cheating" solution that "thinks" that a
random graph has a clique of size $\approx n^{1/2d}$ with high probability.

For any set $I \subseteq [n]$, let $X_I$ be the monomial $\prod_{i \in I} x_i$.

Definition 4.1 (MPW operator for clique size $\omega$). For a graph $G$ on $n$ vertices and a parameter $\omega > 0$
we define a linear functional $\tilde{\mathbb{E}} : \mathcal{P}^n_{2d} \to \mathbb{R}$. To describe $\tilde{\mathbb{E}}$ it is enough to describe its values on every
monomial $X_I$ for $I \subseteq [n]$, $|I| \le 2d$. Towards this goal, for any set $I \subseteq [n]$, $|I| \le 2d$, we define

$$\deg_G(I) = |\{S \subseteq [n] : I \subseteq S, |S| = 2d, S \text{ is a clique in } G\}|.$$

Further, we let $C_{2d} = C_{2d}(G)$ be the number of $2d$-cliques in $G$.
For every $I \subseteq [n]$, we define:

$$\tilde{\mathbb{E}}[X_I] = \frac{\deg_G(I)}{C_{2d}} \cdot \frac{\binom{\omega}{|I|}}{\binom{2d}{|I|}}. \qquad (4.1)$$


Our definition of E is the same as the one used in [MPW15] up to normalization (we explicitly 
satisfy the normalization condition E[l] = 1). When co is chosen so that E is PSD, we often call it the 
MPW pseudo distribution. 
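
To make Definition 4.1 concrete, the following Python sketch evaluates the operator on a toy graph. It assumes the normalization $\tilde{\mathbb{E}}[X_I] = \frac{\deg_G(I)}{C_{2d}} \cdot \binom{\omega}{|I|} / \binom{2d}{|I|}$ for $|I| \le 2d$, restricts to integer $\omega$ so that `math.comb` applies, and uses vertices numbered from 0; the function names and the example graph ($K_5$ minus one edge) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def is_clique(edges, S):
    return all(frozenset(p) in edges for p in combinations(S, 2))

def deg(n, edges, I, d):
    """deg_G(I): number of 2d-cliques S of G with I ⊆ S (requires |I| <= 2d)."""
    I = set(I)
    rest = [v for v in range(n) if v not in I]
    return sum(
        1
        for extra in combinations(rest, 2 * d - len(I))
        if is_clique(edges, I | set(extra))
    )

def mpw_moment(n, edges, I, d, omega):
    """Assumed form of (4.1): deg_G(I)/C_2d * C(omega,|I|)/C(2d,|I|), integer omega."""
    c2d = deg(n, edges, (), d)          # number of 2d-cliques in G
    k = len(set(I))
    return Fraction(deg(n, edges, I, d), c2d) * Fraction(comb(omega, k), comb(2 * d, k))

# Hypothetical example: K_5 on vertices 0..4 with the edge {0, 1} removed.
K5_MINUS_EDGE = {frozenset(p) for p in combinations(range(5), 2)} - {frozenset((0, 1))}
```

On this example with $d = 2$ the operator is normalized, the first moments sum to $\omega$, and the moment of the non-edge $\{0, 1\}$ vanishes, matching the constraints the operator is meant to satisfy.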


It is easy to check that the linear operator E satisfies the constraints in (2.1) which we record as 
the following fact: 


Fact 4.2. For any graph $G$, $\tilde{\mathbb{E}}$ defined by Definition 4.1 satisfies the constraints described in (2.1).

The main task then is to show that E is PSD for appropriate range of co. This task is simplified 
by another observation from [MPW15] that we state next. 

Fact 4.3 (Corollary 2.4 in [MPW15]). For $\tilde{\mathbb{E}}$ of degree $d$ defined in Definition 4.1, $\tilde{\mathbb{E}}$ is positive semidefinite
iff $\tilde{\mathbb{E}}[p^2] \ge 0$ for every multilinear, homogeneous polynomial $p$ of degree $d$.

We define a matrix $\mathcal{M} \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ such that for $I, J \in \binom{[n]}{d}$,

$$\mathcal{M}(I, J) = \deg_G(I \cup J) \cdot \frac{\binom{\omega}{|I \cup J|}}{\binom{2d}{|I \cup J|}}. \qquad (4.2)$$

Then, from the fact above, showing that $\tilde{\mathbb{E}}$ is PSD is equivalent to proving that $\mathcal{M}$ is PSD. The
goal of the next section is to establish that, with high probability over the draw of $G \sim G(n, 1/2)$,
$\mathcal{M}$ is PSD for $\omega \le \tilde{O}(n^{1/(d+1)})$, which completes the proof.
Theorem 4.4. With probability at least $1 - 1/n$ over the draw of $G \sim G(n, 1/2)$ and $d = o(\sqrt{\log n})$, $\mathcal{M}$
defined by Equation (4.2) is PSD whenever $\omega \le \tilde{O}(n^{1/(d+1)})$.

Our analysis improves upon the analysis in [MPW15] and generalizes the improved analysis
for the special case of $d = 2$ done in [DM15]. By a generalization of the counterexample due to
Kelner, our analysis can actually be shown to be tight. We defer the details of the counterexample
to the full version. In the remaining part of this section, we begin the task of proving Theorem 4.4
by introducing certain simplifications and computing the eigenvalues of the expected value of the
matrix under $G(n, 1/2)$.

4.1 Reduction to PSDness of $\mathcal{M}'$

The presence of some zero rows in $\mathcal{M}$ (corresponding to index sets $S$ that are not cliques in $G$)
poses a problem in analyzing its spectrum. As in [MPW15], we evade this issue by working with
$\mathcal{M}' \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ obtained by filling in the zero rows of $\mathcal{M}$ while not affecting the nonzero rows of $\mathcal{M}$.
Since the nonzero part of $\mathcal{M}$ (for any $G$) is a submatrix of $\mathcal{M}'$, proving PSDness of $\mathcal{M}'$ is enough.
We describe $\mathcal{M}'$ next and begin by setting up some notation towards that goal:

For any 0 ^ i ^ d, let 



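Definition 4.5 ("Filled-in" matrix). Let $\mathcal{M}_T \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ be defined by $\mathcal{M}_T(I, J) = \beta(|I \cap J|)$ whenever
$I \cup J \subseteq T$ and $\mathcal{E}(T) \setminus (\mathcal{E}(I) \cup \mathcal{E}(J)) \subseteq E$. We define the filled-in matrix $\mathcal{M}'$ as:

$$\mathcal{M}' = \sum_{T : |T| = 2d} \mathcal{M}_T.$$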

Observe that for any $I, J$, $\mathcal{M}'(I, J)$ is chosen so as to depend only on the edges with one endpoint
in $I$ and the other in $J$. Intuitively, this corresponds to thinking of $I$ and $J$ as being cliques in $G$ by
addition of some edges. Moreover, $\mathcal{M}'(I, J)$ is chosen so that $\mathcal{M}'(I, J) = \mathcal{M}(I, J)$ whenever $I, J$ are
actually cliques in $G$. Thus, as noted above, we have the following fact.

Fact 4.6 (Lemma 5.1 in [MPW15]). $\mathcal{M}$ is PSD if $\mathcal{M}'$ is PSD.

To analyze $\mathcal{M}'$ we decompose it into two parts, initially writing $\mathcal{M}' = E + D$ where $E = \mathbb{E}_{G \sim G(n, 1/2)}[\mathcal{M}']$.
We show that $E$ is PSD with all eigenvalues bounded away from 0 in the next subsection, following
which we analyze the deviation $D = \mathcal{M}' - E$ by writing it as a sum of various pieces and decomposing
the action of each piece along the eigenspaces of $E$ in Section 7.

Thus, the following lemma completes the proof of Theorem 4.4.

Lemma 4.7 ($\mathcal{M}'$ is PSD). With probability at least $1 - 1/n$ over the draw of $G \sim G(n, 1/2)$, for $d = o(\sqrt{\log n})$,
$\mathcal{M}'$ defined by Definition 4.5 satisfies $\mathcal{M}' \succeq 0$ whenever $\omega \le \tilde{O}(n^{1/(d+1)})$.
4.2 The Expectation Matrix

The minimum eigenvalue of the expectation matrix $E = \mathbb{E}[\mathcal{M}']$ was analyzed in [MPW15] via
known results about the Johnson scheme matrices. The same proof also yields all the eigenvalues
of $E$, which we note here.

We first describe the entries of the matrix $\mathcal{M}'$.

Fact 4.8 (Entries of $\mathcal{M}'$, see Claim 7.3 in [MPW15]). For every $I, J \in \binom{[n]}{d}$ and $E = \mathbb{E}[\mathcal{M}']$,



Next, we need a basic fact about the (shared) eigenspaces of all the set symmetric matrices, in
particular their number and dimensions, which follows from the following well known result from
the classical theory of Johnson schemes.

Fact 4.9 (Lemma 6.6 of [MPW15]). Fix $n$, $d \le n/2$ and let $\mathcal{J} = \mathcal{J}(n, d)$ be the set of all set symmetric
matrices in $\mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$. Then, there exist subspaces $V_0, V_1, \ldots, V_d \subseteq \mathbb{R}^{\binom{[n]}{d}}$ that are orthogonal to each other
such that:

1. $V_0, V_1, \ldots, V_d$ are eigenspaces for every $J \in \mathcal{J}$ and are isomorphic to distinct irreducible representations
of the symmetric group $S_n$ (see Section 6 for definitions).

2. For $0 \le j \le d$, $\dim(V_j) = \binom{n}{j} - \binom{n}{j-1}$.
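
A quick consistency check on the dimension formula in Fact 4.9: the dimensions $\binom{n}{j} - \binom{n}{j-1}$ telescope to $\binom{n}{d}$, the dimension of the ambient space. The short sketch below verifies this for one choice of parameters; the function name is hypothetical and the convention $\binom{n}{-1} = 0$ is assumed.

```python
from math import comb

def johnson_dims(n, d):
    """Dimensions C(n, j) - C(n, j-1) for j = 0..d, with C(n, -1) taken as 0."""
    return [comb(n, j) - (comb(n, j - 1) if j >= 1 else 0) for j in range(d + 1)]

dims = johnson_dims(10, 4)   # dimensions of V_0, ..., V_4 for n = 10, d = 4
```

All entries are positive exactly because $d \le n/2$ makes the binomial coefficients increasing in $j$ over this range.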

Using a nice basis for the matrices in $\mathcal{J}$, one can obtain the following estimates of the eigenvalues
of $E$ on $V_j$ for each $0 \le j \le d$:

Lemma 4.10 (Eigenvalues of E). Let co < and d ^ cu/2. Let Aj(E) be the eigenvalue ofE on Vj as 
defined in Fact 4.9 . Then, 

^ 4■ (" ■■ T f 1 ■ g ■ ^ - ("’) - (t;) > ^ 

5 The Corrected Operator for Degree Four 

In this section, we present the pseudodistribution that we will use to show an almost optimal lower
bound on the degree 4 SoS algorithm. Our pseudodistribution is obtained by "correcting" the
one described in the previous section. The correction itself is inspired by an explicit polynomial
described by Kelner, who showed that the pseudodistribution from the previous section for degree
4 does not satisfy positivity for $\omega \gg n^{1/3}$.

We now lay some groundwork for defining our modified operator. In the following, we will
always work on a fixed graph $G$ on $[n]$ and use $\tilde{\mathbb{E}}_0$ to denote the MPW pseudoexpectation operator
for $d = 2$. We start by defining a specific neighborhood indicator vector for every vertex in $G$. For a
vertex $s \in G$, let the vector $r_s \in \mathbb{R}^n$ be given by


$$r_s(j) = \begin{cases} +1 & \text{if } s \sim j \\ -1 & \text{if } s \not\sim j \\ 0 & \text{if } s = j \end{cases}$$


Next, we define the additive correction $X$ to the MPW pseudoexpectation operator $\tilde{\mathbb{E}}_0$. $X$ will be
a linear operator on the space of homogeneous degree 4 polynomials. Because of linearity, it is
enough to define $X$ on the basis of all monomials of degree 4.


Definition 5.1 (Correction Term). Let $\gamma > 0$ be a real parameter to be chosen later. Let $X$ be the
linear operator on the linear space of homogeneous, multilinear polynomials of degree 4 such that:

$$X[x_i x_j x_k x_\ell] = \begin{cases} \gamma \cdot \frac{\omega^5}{n^5} \sum_s r_s(i) r_s(j) r_s(k) r_s(\ell) & \text{if } i, j, k, \ell \text{ form a clique in } G \\ 0 & \text{otherwise} \end{cases}$$

The following is easy to prove. 

Fact 5.2. For $\omega > 0$ and $c \ll \omega$, there exists $x$, $x = \omega \pm O(c/\omega^3)$, such that $\binom{x}{4} = \binom{\omega}{4} + c$.

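We now go on to define the corrected moments $\tilde{\mathbb{E}} : \mathcal{P}^n_4 \to \mathbb{R}$.

Definition 5.3 (Corrected Pseudoexpectation). We first use the correction operator $X = X_\gamma$ to define
the corrected moments on all multilinear monomials of degree 4. For every $S \subseteq [n]$, $|S| = 4$, we set:

$$\tilde{\mathbb{E}}[x_S] = \tilde{\mathbb{E}}_0[x_S] + X[x_S].$$

Next, we want to extend $\tilde{\mathbb{E}}$ to all the monomials so that $\tilde{\mathbb{E}}[1] = 1$ and $\tilde{\mathbb{E}}[\sum_i x_i] = \omega'$ for some $\omega' \approx \omega$.
Towards this, we let $c = \sum_{S : S \text{ is a 4-clique in } G} X[x_S]$. Then, observe that:

$$\tilde{\mathbb{E}} \sum_{S : S \text{ is a 4-clique in } G} x_S = \tilde{\mathbb{E}}_0 \sum_{S : S \text{ is a 4-clique in } G} x_S + X \sum_{S : S \text{ is a 4-clique in } G} x_S = \binom{\omega}{4} + c.$$

Then we know there exists $\omega' = \omega \pm O(c/\omega^3) > 0$ satisfying $\binom{\omega'}{4} = \binom{\omega}{4} + c$ (using Fact 5.2). Thus, the
degree 4 moments we defined "think" $\sum_i x_i = \omega'$. We use this relationship to extend the definition
to all the monomials. For every $S \subseteq [n]$, $|S| = 3$,

$$\tilde{\mathbb{E}}[x_S] = \frac{1}{\omega' - 3} \sum_{i \notin S} \tilde{\mathbb{E}}[x_{S \cup \{i\}}].$$

Similarly, for each $S$ with $|S| = 2$,

$$\tilde{\mathbb{E}}[x_S] = \frac{1}{\omega' - 2} \sum_{i \notin S} \tilde{\mathbb{E}}[x_{S \cup \{i\}}].$$

Finally, we set

$$\tilde{\mathbb{E}}[x_i] = \frac{1}{\omega' - 1} \sum_{j \ne i} \tilde{\mathbb{E}}[x_i x_j]$$

and $\tilde{\mathbb{E}}[1] = 1$.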
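
The downward extension in Definition 5.3 can be sketched in code. The following Python fragment is an illustration, not the paper's construction verbatim: it assumes the recursion "moment of $S$ = sum of moments of $S \cup \{i\}$ over $i \notin S$, divided by $\omega' - |S|$", finds $\omega'$ by bisection from $\binom{\omega'}{4} = $ (sum of the degree-4 moments), and runs on made-up degree-4 values. All names and the toy data are hypothetical.

```python
from itertools import combinations

def solve_omega_prime(total):
    """Find x > 3 with x(x-1)(x-2)(x-3)/24 == total by bisection (needs total > 0)."""
    f = lambda x: x * (x - 1) * (x - 2) * (x - 3) / 24.0
    lo, hi = 3.0, 4.0
    while f(hi) < total:          # f is increasing on [3, inf), so this brackets the root
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f(mid) < total:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def extend_down(n, E4):
    """Extend degree-4 moments E4 (dict: 4-element frozenset -> value) to sizes 3, 2, 1."""
    w = solve_omega_prime(sum(E4.values()))
    moments = {4: dict(E4)}
    for size in (3, 2, 1):
        level = {}
        for S in combinations(range(n), size):
            S = frozenset(S)
            up = sum(moments[size + 1].get(S | {i}, 0.0) for i in range(n) if i not in S)
            level[S] = up / (w - size)
        moments[size] = level
    return w, moments

# Made-up positive degree-4 values on all 4-subsets of {0,...,5}, for illustration only.
E4_TOY = {frozenset(S): 1.0 + (sum(S) % 3) for S in combinations(range(6), 4)}
W_TOY, MOMENTS_TOY = extend_down(6, E4_TOY)
```

Whatever the degree-4 values are, the first moments produced this way sum to $\omega'$, which is the identity checked in the proof of Lemma 5.5.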
Theorem 5.4 (Theorem 1.1, formal). Let $G \sim G(n, 1/2)$. There is $\omega = \Omega(\sqrt{n} / \mathrm{polylog}\, n)$ so that with
probability $1 - 1/n$ the operator $\tilde{\mathbb{E}}$ of Definition 5.3 is a valid degree-4 pseudo-expectation satisfying (2.1)
for $d = 2$.

It is not hard to show that $\tilde{\mathbb{E}}$ of Definition 5.3 satisfies the constraints in (2.1) and that the correction
above does not change $\omega$ by a lot. We defer the proofs to Section 5.1.

Lemma 5.5. Let $\tilde{\mathbb{E}}$ be the degree-4 corrected moments for clique size $\omega$ (Definition 5.3). Then, there is $\omega'$
such that $\tilde{\mathbb{E}}$ satisfies

$$\{x_i^2 = x_i\}_{i \in [n]}, \quad \{x_i x_j = 0\}_{i \not\sim j \text{ in } G}, \quad \Big\{\sum_i x_i = \omega'\Big\}.$$

Furthermore, if $G \sim G(n, 1/2)$, then with probability $1 - O(n^{-25})$, $|\omega' - \omega| \le O(\gamma \log(n)^2 \omega^2 / n^{5/2})$.

Thus, to show Theorem 5.4, the remaining task is to show that $\tilde{\mathbb{E}}$ satisfies positive semidefiniteness.
Using Fact 4.3, it is enough to show that $\tilde{\mathbb{E}}[p^2] \ge 0$ for every homogeneous, multilinear
polynomial $p$ of degree 2. This is equivalent to showing that the matrix $N' \in \mathbb{R}^{\binom{[n]}{2} \times \binom{[n]}{2}}$ defined by
$N'(I, J) = \tilde{\mathbb{E}}[x_{I \cup J}]$ is PSD. Thus, to complete the proof of Theorem 5.4 we will show the
following lemma, which is the most technical part of the proof.

Lemma 5.6. There is $\omega_0 = \Omega(\sqrt{n} / \mathrm{polylog}\, n)$ and $\gamma = \Theta(1)$ so that for $\omega \le \omega_0$, with probability at least
$1 - 1/n$ over the draw of $G \sim G(n, 1/2)$, $N' \succeq 0$.


5.1 Technical Lemmas and Proofs 


We proceed here to show that $\tilde{\mathbb{E}}$ from Definition 5.3 satisfies the appropriate constraints. We will
need the following lemma giving concentration for certain scalar random variables, including the
extent to which the correction changes the (pseudo)-expected clique size when $G \sim G(n, 1/2)$.

Lemma 5.7. Let $G \sim G(n, 1/2)$. Let the vectors $r_s \in \mathbb{R}^n$ be as defined in Section 5. There is a universal constant
$C$ so that with probability $1 - O(n^{-25})$,

1. $$\Big| \sum_s \sum_{i,j,k,\ell \text{ a clique}} r_s(i) r_s(j) r_s(k) r_s(\ell) \Big| \le C n^{5/2} \log(n)^2.$$

2. For every $i, j, k$ distinct,

$$\Big| \sum_s \sum_{\ell \text{ in a clique with } i,j,k} r_s(i) r_s(j) r_s(k) r_s(\ell) \Big| \le C n \log(n).$$

3. For every $i, j$ distinct and every $s$,

$$\Big| \sum_{k \text{ in a clique with } i,j} r_s(i) r_s(k) \Big| \le C \sqrt{n \log n}.$$
Proof. We prove the first item; the others are similar.

The proof is by several applications of McDiarmid's inequality. By a standard Chernoff bound
there is a universal constant $C_0$ so that for every $s \in [n]$ and every $i, j, k \in [n]$, with probability
$1 - O(n^{-40})$,

$$\Big| \sum_{\ell \text{ in a clique with } i,j,k} r_s(\ell) \Big| \le C_0 \sqrt{n \log n}.$$

Call $E_1$ the event that this occurs for every $s, i, j, k$. Clearly $\mathbb{P}(E_1) \ge 1 - O(n^{-36})$.

Now for every $s, i, j \in [n]$, we apply McDiarmid's inequality to $|\sum_{k,\ell \text{ in a clique with } i,j} r_s(k) r_s(\ell)|$. We
truncate to get rid of the bad event $\neg E_1$. For a graph $G$, let $f(G) = \sum_{k,\ell \text{ in a clique with } i,j} r_s(k) r_s(\ell)$ if $E_1$
occurs for $G$ and $f(G) = 0$ otherwise. Now consider any pair of graphs $G, G'$ differing on a single
edge $(u, v)$. It is straightforward to show that if $\{u, v\} \cap \{s, i, j\} = \emptyset$ then $|f(G) - f(G')| = O(1)$, while
otherwise

$$|f(G) - f(G')| \le \Big| \sum_{\ell \text{ in a clique with } i,j,k} r_s(\ell) \Big| \le C_1 \sqrt{n \log n}$$

for some other universal constant $C_1$. So by McDiarmid's inequality there is $C_2$ so that with
probability $1 - O(n^{-34})$,

$$\Big| \sum_{k,\ell \text{ in a clique with } i,j} r_s(k) r_s(\ell) \Big| \le C_2 n \log n.$$

By a similar argument there is $C_3$ so that for every $s, i \in [n]$, with probability $1 - O(n^{-30})$,

$$\Big| \sum_{j,k,\ell \text{ in a clique with } i} r_s(j) r_s(k) r_s(\ell) \Big| \le C_3 n^{3/2} (\log n)^{3/2}.$$

Let $E_2$ be the event that this bound holds for every $s, i \in [n]$. So $\mathbb{P}(E_2) \ge 1 - O(n^{-27})$. Then, letting
$g(G) = \sum_s \sum_{i,j,k,\ell \text{ a clique}} r_s(i) r_s(j) r_s(k) r_s(\ell)$ if $E_2$ occurs for $G$ and $0$ otherwise, we get that on graphs
$G, G'$ differing on an edge $(u, v)$

$$|g(G) - g(G')| \le C_4 n^{3/2} \log(n)^{3/2}$$

for some other constant $C_4$. The result follows by a final application of McDiarmid's inequality (we
lose a factor of $n$ at this step as opposed to the $\sqrt{n}$ at previous steps because there are $\approx n^2$ edges to
be revealed). □


We can now complete the proof of Lemma 5.5 using Lemma 5.7 and Fact 5.2. 

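Proof of Lemma 5.5. The functional $\tilde{\mathbb{E}}$ satisfies the constraints $\{x_i^2 = x_i\}_{i \in [n]}$, $\{x_i x_j = 0\}_{i \not\sim j \text{ in } G}$ by
construction. Let $\omega'$ be as in Definition 5.3. It is routine to check that for $p(x)$ homogeneous of degree
1, 2, or 3, $\tilde{\mathbb{E}}[p(x) \sum_i x_i] = \omega' \, \tilde{\mathbb{E}}[p(x)]$ by definition, so it will be enough to check that $\tilde{\mathbb{E}} \sum_i x_i = \omega'$.
Recall that $\omega'$ satisfies $\omega'(\omega' - 1)(\omega' - 2)(\omega' - 3) = 4! \cdot \tilde{\mathbb{E}} \sum_{i,j,k,\ell \text{ a 4-clique}} x_i x_j x_k x_\ell$. Now we expand:

$$\begin{aligned}
\tilde{\mathbb{E}} \sum_i x_i &= \frac{1}{\omega' - 1} \tilde{\mathbb{E}} \sum_i \sum_{j \ne i} x_i x_j \\
&= \frac{1}{(\omega' - 1)(\omega' - 2)} \tilde{\mathbb{E}} \sum_{i,j,k \text{ all distinct}} x_i x_j x_k \\
&= \frac{1}{(\omega' - 1)(\omega' - 2)(\omega' - 3)} \tilde{\mathbb{E}} \sum_{i,j,k,\ell \text{ all distinct}} x_i x_j x_k x_\ell \\
&= \frac{1}{(\omega' - 1)(\omega' - 2)(\omega' - 3)} \cdot 4! \cdot \tilde{\mathbb{E}} \sum_{i,j,k,\ell \text{ a 4-clique}} x_i x_j x_k x_\ell \\
&= \omega'.
\end{aligned}$$

It remains just to show our claim on $|\omega - \omega'|$. By our choice of $\omega'$ and the guarantees of Fact 5.2, we get
that $|\omega - \omega'| \le O(|X \sum_{i,j,k,\ell \text{ a 4-clique}} x_i x_j x_k x_\ell| / \omega^3)$, where $X$ is the correction operator from Definition 5.1.
By Lemma 5.7, this is, with probability $1 - O(n^{-25})$, at most $O(\gamma \omega^2 \log(n)^2 / n^{5/2})$ when $G \sim G(n, 1/2)$. □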

In Section 6 , we develop some general tools for analyzing the matrices that we encounter before 
going on to prove Theorem 4.4 and Lemma 5.6. 


6 Tools 

In this section, we build some general purpose tools helpful in the analysis of the matrices of interest
to us. We give statements and proofs that are more or less independent of the rest of the paper with
an eye towards future work on planted clique and related problems where one deals with random
matrices with structured dependencies. The first three sections focus on building an understanding
of the symmetries of the eigenspaces $V_0, V_1, \ldots, V_d$ of set symmetric matrices on $\mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$. The
last section uses the moment method with some combinatorial techniques to obtain tight estimates of
the spectral norm for certain random matrices with dependent entries.

6.1 Background on Representations of Finite Groups 

We provide background in the required tools from basic representation theory below. 

Definition 6.1 (Representation). For a finite dimensional complex vector space $V$, let $\mathrm{Hom}(V, V)$ be
the set of all linear maps from $V$ into $V$. For any finite group $G$ and $\pi : G \to \mathrm{Hom}(V, V)$, the pair
$(\pi, V)$ is said to be a representation of $G$ if $\pi$ satisfies, for any $g_1, g_2 \in G$,

$$\pi(g_1 \cdot g_2) = \pi(g_1) \cdot \pi(g_2),$$

where the $\cdot$ on the LHS corresponds to the group operation and, on the RHS, to the composition of
linear maps on $V$. When the map $\pi$ is clear from the context (as some natural action of the group $G$
on $V$), we abuse notation and just say that $V$ is a representation of $G$.

Let $(\pi, V)$ be a representation of a group $G$. A subspace $W \subseteq V$ is said to be a subrepresentation if
for every $w \in W$, $\pi(g) w \in W$ for every $g \in G$. That is, $W$ is a stable or invariant subspace for all the
linear maps $\pi(g)$, one for each $g \in G$. Observe that in this case, $(\pi, W)$ is another representation of $G$.
A representation $(\pi, V)$ of $G$ is said to be irreducible if for any subspace $W$ invariant under all the
linear maps $\pi(g)$ for $g \in G$, $W = V$ or $W = \{0\}$.
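
As a small worked instance of these definitions, the sketch below checks that permutation matrices give a representation of the symmetric group on three elements, and that the all-ones vector spans an invariant subspace (so this representation is not irreducible). The example and the helper names are illustrative assumptions, with points numbered from 0.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix of the linear map e_j -> e_{sigma(j)}; sigma is a tuple on 0..n-1."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def compose(s, t):
    """Group operation: (s . t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(s)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

S3 = list(permutations(range(3)))

# pi(g1 . g2) = pi(g1) . pi(g2) for all pairs of group elements.
is_homomorphism = all(
    perm_matrix(compose(s, t)) == matmul(perm_matrix(s), perm_matrix(t))
    for s in S3 for t in S3
)

# Each permutation matrix fixes the all-ones vector (its row sums are all 1).
ones_fixed = all([sum(row) for row in perm_matrix(s)] == [1, 1, 1] for s in S3)
```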

Every representation $V$ of $G$ can be decomposed as a direct sum of subspaces each of which is
an irreducible representation of $G$. Further, for any finite group $G$, there are at most $|G|$ distinct
irreducible representations up to isomorphism. For well studied finite groups such as the symmetric
group on $n$ elements $S_n$, the set of irreducible representations is well known and well studied. The
power of representation theory in the present context comes from understanding the eigenspace
structure of linear operators that are invariant under some action of the group $G$ (in our case $S_n$).

Let $P_q$ denote the space of all vectors indexed by elements of $\binom{[n]}{q}$.
There is a natural linear action of the permutation group $S_n$ on $P_q$ for any $q$, denoted by
$\pi : S_n \to \mathrm{Hom}(P_q, P_q)$: a permutation $\sigma \in S_n$ when applied to a vector $v \in P_q$ produces the vector
$v' \in P_q$ such that $v'_I = v_{\sigma^{-1}(I)}$ for every $I \in \binom{[n]}{q}$. This can be alternately described as multiplication
by the permutation matrix associated with $\sigma$. Observe that $(\sigma_1 \cdot \sigma_2) \cdot v = \sigma_1 \cdot (\sigma_2 \cdot v)$ and thus
$(\pi, P_q)$ is a representation of $S_n$. It is known (see Section 3.2, [BI84]) that under this action, $P_q$
can be decomposed as a direct sum of subspaces $V_0, V_1, \ldots, V_q$ such that each $V_i$ is an irreducible
representation of $S_n$ and none of $V_i, V_j$ for $i \ne j$ are isomorphic to each other.

The expected moment matrix $E = \mathbb{E}[\mathcal{M}']$ is set symmetric and therefore commutes with the
action of $S_n$ on $P_d$ described above. This can be easily used to obtain that $E$ has the eigenspaces
$V_0, V_1, \ldots, V_d$ discussed above. We need the following consequence of a basic
representation-theoretic result.

Fact 6.2 (Consequence of Schur's Lemma [Ser12]). Suppose $(\pi, V)$ and $(\pi', W)$ are representations of a
group $G$. Suppose $L : V \to W$ is a linear map such that for any $g \in G$ and $v \in V$,

$$L(\pi(g) \cdot v) = \pi'(g) \cdot L(v).$$

Then, for any irreducible representation $V_i \subseteq V$ under $\pi$, $L(V_i) \subseteq W$ is an irreducible representation in $W$
under $\pi'$.

6.2 Eigenspaces of the Set Symmetric Matrices 

We often encounter random $\binom{n}{d} \times \binom{n}{d}$ matrices $M$ indexed by subsets of $[n]$ of size $d$. For example, a
common feature in our setting (as observed in [MPW15]) is that the entry $E(I, J)$ of $E = \mathbb{E}[M]$ depends only on $|I \cap J|$.

Definition 6.3 (Set Symmetry). A matrix $A \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ is said to be set symmetric if for every
$S, T, S', T' \in \binom{[n]}{d}$ such that $|S \cap T| = |S' \cap T'|$, $A(S, T) = A(S', T')$.

The set of all set symmetric matrices is known as the Johnson scheme in algebraic combinatorics. 
All such matrices commute and thus share eigenspaces. While the matrices in the Johnson scheme 
are well studied, the description of the eigenspaces in the literature is hard to use for the purpose 
of our proofs. We thus take a more direct approach and use basic representation theory in what 
follows to identify a simple symmetry condition on the eigenspaces of set symmetric matrices 
which will be useful to understand the spectral properties of the matrices we study. 
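
The claim that set symmetric matrices commute can be checked directly on a small case. The sketch below builds two such matrices on the 2-subsets of a 5-element ground set from arbitrarily chosen integer coefficients (one per intersection size) and compares the two products; the helper names and the coefficients are hypothetical.

```python
from itertools import combinations

def set_symmetric(n, d, coeffs):
    """Matrix A with A(S, T) = coeffs[|S ∩ T|], indexed by d-subsets of range(n)."""
    subsets = [frozenset(c) for c in combinations(range(n), d)]
    return [[coeffs[len(S & T)] for T in subsets] for S in subsets]

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)] for i in range(m)]

A = set_symmetric(5, 2, [3, -1, 2])   # arbitrary integer coefficients
B = set_symmetric(5, 2, [0, 5, -4])
commute = matmul(A, B) == matmul(B, A)
```

Integer entries keep the comparison exact, so no floating point tolerance is needed.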
Lemma 6.4. Let $V_0, V_1, \ldots, V_d \subseteq \mathbb{R}^{\binom{[n]}{d}}$ be the eigenspaces of set symmetric matrices on $\mathbb{R}^{\binom{[n]}{d}}$ described
in the previous section. For any $u \in \mathbb{R}^{\binom{[n]}{t}}$, let $v \in \mathbb{R}^{\binom{[n]}{d}}$ for $d \ge t$ be defined so that for each $I \in \binom{[n]}{d}$,

$$v_I = \sum_{I' \subseteq I, |I'| = t} u_{I'}.$$

Then, $v \in V_0 \oplus V_1 \oplus \cdots \oplus V_t$.

Proof. Let $P_q$ for any positive integer $q$ be the space of all vectors indexed by elements of $\binom{[n]}{q}$.
Consider the standard action of $S_n$ on $[n]$ that sends $i \to \sigma(i)$ for any $\sigma \in S_n$. This induces a natural
action on $\binom{[n]}{q}$ where any $I \in \binom{[n]}{q}$ is sent to $\sigma(I) = \{ j \mid \exists i \in I \text{ such that } \sigma(i) = j \}$. This further induces
a natural action on $P_q$ by taking $v = \{v_I\}_{I \in \binom{[n]}{q}}$ and sending it to $v'$ where $v'_I = v_{\sigma^{-1}(I)}$ for every $I$.
A quick check ensures that the action defined above satisfies $\sigma \circ \tau(v) = \sigma(\tau(v))$ for any $\sigma, \tau \in S_n$.
Thus, $P_q$ is a representation of $S_n$ under the action defined above for any $q$. It is easy to check
that left multiplication by any set symmetric matrix from $\mathbb{R}^{\binom{[n]}{q} \times \binom{[n]}{q}}$ commutes with the action of
$S_n$ defined above. From Fact 4.9, the eigenspaces of any set symmetric matrix acting on $P_q$ are
given by $V_0, V_1, \ldots, V_q$ such that $\dim(V_i) = \binom{n}{i} - \binom{n}{i-1}$. By Fact 4.9 each $V_i$ is isomorphic to distinct
irreducible representations of $S_n$.

Next, consider the map $C : P_t \to P_d$ such that for any $u \in \mathbb{R}^{\binom{[n]}{t}}$, the value $C(u)$ is given
by $v$ such that $v_I = \sum_{I' \subseteq I, |I'| = t} u_{I'}$. Then, $C$ is linear and we claim that $C$ commutes with the action of
$S_n$ defined above: $\sigma(C(u)) = C(\sigma(u))$. Note that on the LHS, $\sigma$ refers to the action of $S_n$ on $P_d$ while
on the RHS, it refers to the action on $P_t$. We follow the definition to verify this:

$$(\sigma(C(u)))_I = (C(u))_{\sigma^{-1}(I)} = \sum_{I' \subseteq \sigma^{-1}(I), |I'| = t} u_{I'} = \sum_{I' \subseteq I, |I'| = t} u_{\sigma^{-1}(I')} = (C(\sigma(u)))_I.$$

Suppose $u \in V'_i$ for $i \le t$ where $V'_i$ is some eigenspace of a set symmetric $\binom{n}{t} \times \binom{n}{t}$ matrix. Then,
by Fact 6.2, $C(V'_i)$ is an irreducible representation of $S_n$ and is thus an invariant subspace for the
action of $S_n$ in $P_d$. By a dimension argument, $C(V'_i) = V_i$. Thus, $C(u) \in V_i$.

□
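
The key step in the proof of Lemma 6.4, that the lifting map $C$ commutes with the action of the symmetric group, can be checked exhaustively on a small case. The sketch below assumes the convention $(\sigma v)_I = v_{\sigma^{-1}(I)}$, numbers the ground set from 0, and uses hypothetical helper names and made-up integer data.

```python
from itertools import combinations, permutations

def act(sigma, v):
    """(sigma v)_I = v_{sigma^{-1}(I)}: the entry stored at I moves to sigma(I)."""
    return {frozenset(sigma[i] for i in I): val for I, val in v.items()}

def lift(u, n, d, t):
    """C(u)_I = sum of u_{I'} over the t-subsets I' of the d-subset I."""
    return {
        frozenset(I): sum(u[frozenset(Ip)] for Ip in combinations(I, t))
        for I in combinations(range(n), d)
    }

n, d, t = 5, 3, 2
# Arbitrary integer vector indexed by t-subsets of {0,...,n-1}.
u = {frozenset(c): 3 * min(c) + 11 * max(c) for c in combinations(range(n), t)}

# sigma(C(u)) == C(sigma(u)) for every permutation sigma of the ground set.
intertwines = all(
    act(s, lift(u, n, d, t)) == lift(act(s, u), n, d, t)
    for s in permutations(range(n))
)
```

This verifies only the commutation step on one vector; the conclusion about which eigenspaces contain $C(u)$ still relies on Fact 6.2 and the dimension argument.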


6.3 Kernels of Patterned Matrices 

In this section we design some general tools to understand the spectral structure of matrices that 
have restricted variations around the set symmetric structure discussed in the previous section. The 
main tool we will use to establish these results is Lemma 6.4 shown in the previous section. Before 
moving on to this task, we describe a high level overview of what we intend to do. The following 
paragraph can be skipped to dive directly into the technical details without the loss of continuity. 

The study of the eigenspaces of set symmetric matrices lets us completely understand the
spectral structure of the expectation matrix $E$. In the next section when we analyze the spectrum
of $\mathcal{M}'$, we will encounter matrices that depend on the underlying graph $G$ and thus are not set
symmetric. However, if the dependence on the underlying graph $G$ is in some sense limited, we
hope that some of the nice algebraic properties that set symmetry grants us should perhaps continue
to hold. In our case, we will be able to decompose $E$ into various pieces and for each of these pieces,
the entry at $(I, J)$ has dependence on the graph $G$ based only on the status (edge vs no edge) of a
small number of pairs $(i, j) \in I \times J$. The goal of this section is to develop tools to understand certain
(coarse) spectral properties of such matrices.

Our aim is to study matrices $Q = Q(G) \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ for a graph $G$ on $[n]$ such that $Q(I, J)$
depends on a) the intersections between $I$ and $J$ and b) the values of $g_b$ (the edge indicator of $G$) for
pairs $b$ of vertices (from the non-intersecting parts of $I$ and $J$). We first develop some notation to talk
about such matrices.

Next, we define patterns: 

Definition 6.5 (Pattern). For $Z_\ell, Z_r \subseteq [d]$, let $\mathcal{B}_{Z_\ell, Z_r}$ be the set of all non-empty bipartite graphs
on left and right vertex sets given by $[d] \setminus Z_\ell$ on the left and $[d] \setminus Z_r$ on the right. Define
$\mathcal{B}_q = \bigcup_{|Z_\ell| = |Z_r| = q} \mathcal{B}_{Z_\ell, Z_r}$. Then, a tuple $(B, Z_\ell, Z_r)$ for $B \in \mathcal{B}_{Z_\ell, Z_r}$ is said to be a $q$-pattern where
$q = |Z_\ell| = |Z_r|$. When $q = 0$, we call $B$ itself a pattern.

For any sets $I, J$, consider the "sorting maps" $\zeta_I : [d] \to I$, $\zeta_J : [d] \to J$, i.e., $\zeta_I(1)$ is the least element
of $I$, $\zeta_I(2)$ the next to the least, and so on. We can extend $\zeta_I, \zeta_J$ to subsets of $[d]$ in the natural way.
Let $B_\ell$ ($B_r$) be the subset of vertices on the left (right) hand side with nonzero degrees in $B \in \mathcal{B}_{Z_\ell, Z_r}$.
For any $I, J \in \binom{[n]}{d}$, there is a natural map that takes $B$ and obtains a copy of $B$ on vertex sets $I$ and
$J$, via the sorting maps $\zeta_I$ and $\zeta_J$ from above: $\zeta_{I,J}(B)$ is the bipartite graph on $I, J$ with the edges
obtained by taking every edge $b = \{i, j\} \in B$ and adding the edge $\{\zeta_I(i), \zeta_J(j)\}$ to $\zeta_{I,J}(B)$.

We need to understand the effect of applying a permutation $\sigma \in S_d$ to $(B, Z_\ell, Z_r)$ for $B \in \mathcal{B}_{Z_\ell, Z_r}$.
Let $\sigma \in S_d$ be a permutation on $[d]$. Given $(B, Z_\ell, Z_r)$, $\sigma$ has two natural actions. The left
action of $\sigma$ on $(B, Z_\ell, Z_r)$ produces $\sigma \circ (B, Z_\ell, Z_r) \stackrel{\mathrm{def}}{=} (\sigma \circ B, \sigma(Z_\ell), Z_r)$ where each edge $(i, j) \in B$ is
sent into $(\sigma(i), j)$ in $\sigma \circ B$. We similarly define the right action of $\sigma$ on $(B, Z_\ell, Z_r)$ that produces
$(B, Z_\ell, Z_r) \circ \sigma \stackrel{\mathrm{def}}{=} (B \circ \sigma, Z_\ell, \sigma(Z_r))$. Each of these two actions defines a subgroup that leaves $B$
invariant.

Definition 6.6 (Automorphism Groups). Let $B \in \mathcal{B}_{Z_\ell, Z_r}$ be a labeled bipartite graph. We define the
left automorphism group of $(B, Z_\ell, Z_r)$ as

$$\mathrm{Aut}_\ell(B, Z_\ell, Z_r) = \{\sigma \in S_d \mid \sigma \circ (B, Z_\ell, Z_r) = (B, Z_\ell, Z_r)\},$$

and the right automorphism group of $(B, Z_\ell, Z_r)$ as

$$\mathrm{Aut}_r(B, Z_\ell, Z_r) = \{\sigma \in S_d \mid (B, Z_\ell, Z_r) \circ \sigma = (B, Z_\ell, Z_r)\}.$$

Next, we define equivalence classes of the patterns $(B, Z_\ell, Z_r)$.

Definition 6.7 (Similar Patterns). We say that a pattern $(B, Z_\ell, Z_r)$ is left similar to $(B', Z'_\ell, Z'_r)$ and write
$(B, Z_\ell, Z_r) \sim_\ell (B', Z'_\ell, Z'_r)$ if there exists a $\sigma \in S_d$ such that $\sigma \circ (B, Z_\ell, Z_r) = (B', Z'_\ell, Z'_r)$. Similarly, we
say that $(B, Z_\ell, Z_r)$ is right similar to $(B', Z'_\ell, Z'_r)$ and write $(B, Z_\ell, Z_r) \sim_r (B', Z'_\ell, Z'_r)$ if there exists a
$\sigma \in S_d$ such that $(B', Z'_\ell, Z'_r) = (B, Z_\ell, Z_r) \circ \sigma$.
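
The left action, the left automorphism group, and left similarity classes can be enumerated by brute force for small $d$. In the sketch below a pattern is encoded as a triple of frozensets (edges as ordered pairs of left and right vertex, then the two sets $Z_\ell$, $Z_r$), with vertices numbered from 0; the encoding, the function names, and the example patterns are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial

def left_act(sigma, pattern):
    """sigma o (B, Z_l, Z_r) = (sigma o B, sigma(Z_l), Z_r); vertices are 0..d-1."""
    B, Zl, Zr = pattern
    return (
        frozenset((sigma[i], j) for (i, j) in B),
        frozenset(sigma[i] for i in Zl),
        Zr,
    )

def left_automorphisms(pattern, d):
    """All sigma in S_d whose left action fixes the pattern."""
    return [s for s in permutations(range(d)) if left_act(s, pattern) == pattern]

def left_similar_class(pattern, d):
    """All patterns that are left similar to the given one."""
    return {left_act(s, pattern) for s in permutations(range(d))}

# Hypothetical 0-pattern with d = 3: left vertex 0 joined to right vertices 0 and 1.
P0 = (frozenset({(0, 0), (0, 1)}), frozenset(), frozenset())
# Hypothetical 1-pattern with d = 3: a single edge and Z_l = Z_r = {2}.
P1 = (frozenset({(0, 0)}), frozenset({2}), frozenset({2}))
```

In both examples the size of the similarity class times the size of the automorphism group equals $d!$, which is the orbit-stabilizer count that lets a sum over a similarity class be rewritten as a normalized sum over all permutations.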

We are now ready to define patterned matrices: 





Definition 6.8 (Patterned Matrices). Let $(B, Z_\ell, Z_r)$ be a $q$-pattern. Let $f : \{-1, 1\}^B \to \mathbb{R}$ be a function
that maps a $\{-1, 1\}$ labeling of the pairs in $B$ to $\mathbb{R}$. For a graph $G$ on $[n]$ vertices, the patterned matrix
with pattern $(B, Z_\ell, Z_r)$ defined by $f$ is the matrix $Q = Q_{B, Z_\ell, Z_r, f}(G) \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ such that

$$Q(I, J) = \begin{cases} f(\{g_b\}_{b \in \zeta_{I,J}(B)}), & \text{for every } I, J \text{ with } \zeta_I(Z_\ell) = \zeta_J(Z_r) \\ 0, & \text{otherwise.} \end{cases}$$

When $q = 0$, we write $Q_{B,f}$ for the corresponding patterned matrix.

The following result describes the kernels of certain symmetrized sums of $Q_{B, Z_\ell, Z_r, f}$ and is the main
claim of this section.

Lemma 6.9. For a graph $G$, a $q$-pattern $(B, Z_\ell, Z_r)$ and $f : \{-1, 1\}^B \to \mathbb{R}$, let $Q = Q_{B, Z_\ell, Z_r, f}(G) \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$
be the corresponding patterned matrix. Define the left and right symmetrized versions of $Q$ by:

$$Q^\ell = \sum_{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r)} Q_{B', Z'_\ell, Z'_r, f},$$

and

$$Q^r = \sum_{(B', Z'_\ell, Z'_r) \sim_r (B, Z_\ell, Z_r)} Q_{B', Z'_\ell, Z'_r, f},$$

respectively. Let $B_\ell$ ($B_r$) be the subset of vertices on the left (right, respectively) hand side with nonzero
degrees in $B$. Then,

1. For every $j > |B_\ell| + q$,

$$\Pi_j Q^\ell = 0,$$

2. For every $i > |B_r| + q$,

$$Q^r \Pi_i = 0.$$

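Proof. Observe that $Q^\ell_{B, Z_\ell, Z_r, f} = Q^\ell_{B', Z'_\ell, Z'_r, f}$ for any $(B, Z_\ell, Z_r) \sim_\ell (B', Z'_\ell, Z'_r)$. This motivates us to first obtain a more symmetric looking expression for $Q^\ell$ and $Q^r$. Let $\mathbb{S}_d / \mathrm{Aut}_\ell((B, Z_\ell, Z_r))$ (and correspondingly, $\mathbb{S}_d / \mathrm{Aut}_r((B, Z_\ell, Z_r))$) be the set of left (right) cosets of $\mathrm{Aut}_\ell((B, Z_\ell, Z_r))$ ($\mathrm{Aut}_r((B, Z_\ell, Z_r))$, respectively). We have:

$$Q^\ell = \sum_{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r)} Q_{B', Z'_\ell, Z'_r, f} = \sum_{\tau \in \mathbb{S}_d / \mathrm{Aut}_\ell((B, Z_\ell, Z_r))} Q_{\tau \circ (B, Z_\ell, Z_r), f} = \frac{1}{|\mathrm{Aut}_\ell((B, Z_\ell, Z_r))|} \sum_{\sigma \in \mathbb{S}_d} Q_{\sigma \circ (B, Z_\ell, Z_r), f}.$$

Similarly, we have:

$$Q^r = \frac{1}{|\mathrm{Aut}_r((B, Z_\ell, Z_r))|} \sum_{\sigma \in \mathbb{S}_d} Q_{(B, Z_\ell, Z_r) \circ \sigma, f}.$$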

We now begin the argument for proving the first claim. The second claim has an analogous proof. 
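Consider an arbitrary $v = \{v_I\}_{I \in \binom{[n]}{d}} \in \mathbb{R}^{\binom{[n]}{d}}$. We will show that $Q^\ell v \in V_0 \oplus V_1 \oplus \cdots \oplus V_{q + |B_\ell|}$. Towards this goal, define a vector $u = \{u_T\}_{T \in \binom{[n]}{q + |B_\ell|}} \in \mathbb{R}^{\binom{[n]}{q + |B_\ell|}}$ as follows: for each $T \in \binom{[n]}{q + |B_\ell|}$, let $I_T \in \binom{[n]}{d}$ be arbitrary subject to the constraint that $I_T \supseteq T$. We define:

$$u_T = \sum_{\substack{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r) \\ B'_\ell \cup Z'_\ell = T}} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_T}(Z'_\ell)} v_J \, f(\{g_b\}_{b \in \zeta_{I_T, J}(B')}).$$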

We first show that $u$ above is well defined in that the definition does not depend on the specific subset $I_T$ used, so long as $I_T \supseteq T$. We adopt the notation (that ignores the "direction" of action) $B^\sigma \stackrel{\mathrm{def}}{=} \sigma \circ B$ only for the calculations that follow.

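Claim 6.10. Fix $T \in \binom{[n]}{q + |B_\ell|}$ and let $I_1, I_2 \in \binom{[n]}{d}$ be such that $T \subseteq I_1, I_2$. Then,

$$\sum_{\substack{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r) \\ B'_\ell \cup Z'_\ell = T}} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_1}(Z'_\ell)} v_J \, f(\{g_b\}_{b \in \zeta_{I_1, J}(B')}) = \sum_{\substack{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r) \\ B'_\ell \cup Z'_\ell = T}} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_2}(Z'_\ell)} v_J \, f(\{g_b\}_{b \in \zeta_{I_2, J}(B')}).$$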

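Proof of Claim. We can equivalently write the claim above as:

$$\sum_{\sigma \in \mathbb{S}_d} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_1}(\sigma(Z_\ell))} v_J \cdot f(\{g_b\}_{b \in \zeta_{I_1, J}(B^\sigma)}) = \sum_{\sigma \in \mathbb{S}_d} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_2}(\sigma(Z_\ell))} v_J \cdot f(\{g_b\}_{b \in \zeta_{I_2, J}(B^\sigma)}) \qquad (6.1)$$

We start with the LHS and observe that for any $\tau \in \mathbb{S}_d$ it equals:

$$\sum_{\sigma \in \mathbb{S}_d} \; \sum_{J : \zeta_J(Z_r) = \zeta_{I_1}(\sigma \circ \tau(Z_\ell))} v_J \cdot f(\{g_b\}_{b \in \zeta_{I_1, J}(B^{\sigma \circ \tau})}).$$

We show that there exists a $\tau$ such that $\zeta_{I_1}(\sigma \circ \tau(Z_\ell)) = \zeta_{I_2}(\sigma(Z_\ell))$ and $\zeta_{I_1, J}(\sigma \circ \tau \circ B) = \zeta_{I_2, J}(\sigma \circ B)$.

For each $i \in T$, let $b^1_i \in [d]$ be such that $\zeta_{I_1}(b^1_i) = i$. Similarly, for each $i \in T$, let $b^2_i \in [d]$ be such that $\zeta_{I_2}(b^2_i) = i$. For each $i \in T$, choose $\tau$ such that $\tau(b^2_i) = b^1_i$. Then, $\zeta_{I_1}(\tau(b^2_i)) = \zeta_{I_1}(b^1_i) = i$. Thus, $\zeta_{I_1, J}(B'^{\tau}) = \zeta_{I_2, J}(B')$ for every $(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r)$. □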

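We can now show that $(Q^\ell v)_I = \sum_{I' \subseteq I, \, |I'| = q + |B_\ell|} u_{I'}$. We now observe:

$$(Q^\ell v)_I = \sum_{J \in \binom{[n]}{d}} Q^\ell(I, J) \, v_J \qquad (6.2)$$

$$= \frac{1}{|\mathrm{Aut}_\ell(B, Z_\ell, Z_r)|} \sum_{\sigma \in \mathbb{S}_d} \sum_{J} Q_{\sigma \circ (B, Z_\ell, Z_r), f}(I, J) \, v_J$$

Keeping the $J$s that correspond to nonzero entries in $Q_{B', Z'_\ell, Z'_r, f}$,

$$= \frac{1}{|\mathrm{Aut}_\ell(B, Z_\ell, Z_r)|} \sum_{\sigma \in \mathbb{S}_d} \; \sum_{J : \zeta_J(Z_r) = \zeta_I(\sigma(Z_\ell))} Q_{\sigma \circ (B, Z_\ell, Z_r), f}(I, J) \, v_J \qquad (6.3)$$

$$= \frac{1}{|\mathrm{Aut}_\ell(B, Z_\ell, Z_r)|} \sum_{\sigma \in \mathbb{S}_d} \; \sum_{J : \zeta_J(Z_r) = \zeta_I(\sigma(Z_\ell))} f(\{g_b\}_{b \in \zeta_{I, J}(B^\sigma)}) \, v_J \qquad (6.4)$$

Using that $\bigcup_{I' \subseteq I} \{\sigma \mid \zeta_I(B^\sigma_\ell \cup \sigma(Z_\ell)) = I'\}$ forms a partition of $\mathbb{S}_d$ indexed by $I'$,

$$= \sum_{I' \subseteq I} \frac{1}{|\mathrm{Aut}_\ell(B, Z_\ell, Z_r)|} \sum_{\substack{\sigma \in \mathbb{S}_d, \; B^\sigma_\ell \cup \sigma(Z_\ell) = I' \\ J : \zeta_J(Z_r) = \zeta_I(\sigma(Z_\ell))}} f(\{g_b\}_{b \in \zeta_{I, J}(B^\sigma)}) \, v_J$$

Since each $(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r)$ such that $B'_\ell \cup Z'_\ell = I'$ occurs $|\mathrm{Aut}_\ell(B, Z_\ell, Z_r)|$ times in the inner sum,

$$= \sum_{I' \subseteq I} \; \sum_{\substack{(B', Z'_\ell, Z'_r) \sim_\ell (B, Z_\ell, Z_r) \\ B'_\ell \cup Z'_\ell = I'}} \; \sum_{J : \zeta_J(Z_r) = \zeta_I(Z'_\ell)} f(\{g_b\}_{b \in \zeta_{I, J}(B')}) \, v_J \qquad (6.5)$$

Using Claim 6.10:

$$= \sum_{I' \subseteq I} u_{I'}. \qquad (6.6)$$

This completes the proof using Lemma 6.4.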


□ 


6.4 Concentration for Locally Random Matrices over $G(n, \frac{1}{2})$

The goal of this section is to prove strong concentration bounds for the matrices we will encounter in our analysis. The first result is a spectral concentration bound for the patterned matrices $Q = Q_{B,f}(G)$ for $G \sim G(n, \frac{1}{2})$ when $f : \{-1, 0, 1\}^B \to \mathbb{R}$ is given by $f(x) = \prod_{b \in B} x_b$. In other words, the entry $Q(I, J)$ is the product of the edge indicator variables $g_b$ for $b \in \zeta_{I,J}(B)$. These bounds will be used in Section 7.

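Lemma 6.11. For $d \ge 2$, $d = O(\log n)$, and a bipartite graph $B \in \mathcal{B}$, let $Q = Q_{B,f}$ be a patterned matrix with $f(x) = \prod_{b \in B} x_b$. That is,

$$Q(I, J) = \begin{cases} \prod_{b \in \zeta_{I,J}(B)} g_b & \text{if } I \cap J = \emptyset, \\ 0 & \text{otherwise.} \end{cases}$$

Then:

1. When $B$ contains a 2-matching, $\mathbb{P}(\|Q\| \ge n^{d-1} (\log n)^3) \le O(n^{-10})$.

2. When $B$ is not the empty graph, $\mathbb{P}(\|Q\| \ge n^{d - 1/2} (\log n)^3) \le O(n^{-10})$.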

The next main result of this section considers a different class of matrices that appear in the 
analysis in Section 8. 

Lemma 6.12. Let $U \subseteq [2] \times [2]$ be a bipartite graph on 4 vertices and suppose $U$ is nonempty. Let $M \in \mathbb{R}^{\binom{[n]}{2} \times \binom{[n]}{2}}$ be a matrix whose entry at $\{a_1, a_2\}, \{b_1, b_2\}$, for $a_1 \le a_2$ and $b_1 \le b_2$, is given by

$$M[\{a_1, a_2\}, \{b_1, b_2\}] = \begin{cases} \sum_{k \in [n]} g_{k,a_1} g_{k,a_2} g_{k,b_1} g_{k,b_2} \prod_{(i,j) \in U} g_{a_i, b_j} & \text{if } |\{a_1, a_2, b_1, b_2\}| = 4, \\ 0 & \text{otherwise.} \end{cases}$$

Recall that we set $g_{a,a} = 0$ for every $a \in [n]$ by convention. Then, whenever $U$ is nonempty, $\mathbb{P}(\|M\| \ge n^{3/2} (\log n)^3) \le O(n^{-10})$. If $U$ is the empty graph, then $\mathbb{P}(\|M\| \ge n^2 (\log n)^3) \le O(n^{-10})$.





The proofs of both these results are based on the standard idea of analyzing the trace of higher 
powers of a matrix to prove bounds on its spectral norm. The proof of Lemma 6.11 is similar to the 
proofs via the trace power method for bounding the norms of matrices as presented in [DM15]. The 
general format we present here will come in handy for multiple applications to various matrices in 
Section 7. Lemma 6.12 deals with somewhat more complicated matrices that appear in the analysis 
of the corrected operator for degree 4 lower bound. Nevertheless, as is common in such proofs, the 
analysis is based on a combinatorial analysis of the terms that make non zero contribution to the 
trace powers combined with the simplifying effect of random partitioning based arguments. We 
describe the details of the proof in the following section. 
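
The trace power method rests on the deterministic inequality $\|N\| \le (\mathrm{Tr}(NN^\top)^\ell)^{1/(2\ell)}$, which tightens as $\ell$ grows. The sketch below illustrates only this inequality numerically, on an arbitrary small ±1 matrix (not one of the matrices of this section); the function names are hypothetical and plain Python lists are used in place of a linear algebra library.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace_power_bound(N, ell):
    """(Tr (N N^T)^ell)^(1/(2 ell)), an upper bound on the spectral norm of N."""
    A = matmul(N, transpose(N))
    P = A
    for _ in range(ell - 1):
        P = matmul(P, A)
    tr = sum(P[i][i] for i in range(len(P)))
    return tr ** (1.0 / (2 * ell))

def norm_lower_estimate(N, iters=200):
    """Power iteration on N N^T; the Rayleigh quotient never exceeds ||N||^2."""
    A = matmul(N, transpose(N))
    v = [1.0 + i for i in range(len(A))]
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        s = sum(x * x for x in w) ** 0.5
        if s == 0:
            return 0.0
        v = [x / s for x in w]
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    return max(0.0, sum(x * y for x, y in zip(v, Av))) ** 0.5

rng = random.Random(1)
N = [[rng.choice((-1, 1)) for _ in range(8)] for _ in range(8)]
bounds = [trace_power_bound(N, ell) for ell in (1, 2, 4, 8)]
estimate = norm_lower_estimate(N)
```

The values in `bounds` are non-increasing in $\ell$ and each one dominates `estimate`. In the proofs, the expectation of the trace is what gets bounded, by counting the terms that contribute.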

6.4.1 General Tools 

Before diving into the details, we present three general purpose tools that we will employ repeatedly 
in our analyses. For analyzing the spectral norm of a matrix $Q \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$, the first tool allows us to analyze instead a related matrix $Q' \in \mathbb{R}^{n^d \times n^d}$. That is, instead of rows and columns being indexed by subsets of vertices as in $Q$, $Q'$ has rows and columns indexed by ordered tuples of vertices of size $d$. This transformation is not hard, as one can find $Q$ as a principal submatrix of $Q'$.

Lemma 6.13 (Sets to Ordered Tuples). For any $Q \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ define the matrix $Q' \in \mathbb{R}^{n^d \times n^d}$ such that for any ordered tuples $S = (a_1, a_2, \ldots, a_d)$, $T = (b_1, b_2, \ldots, b_d) \in [n]^d$, $Q'(S, T) = Q(\{a_1, a_2, \ldots, a_d\}, \{b_1, b_2, \ldots, b_d\})$. Then, $\|Q\| \le \|Q'\|$.

Proof. It is enough to show that $Q$ occurs as a principal submatrix of $Q'$. For this, take the submatrix of rows and columns of $Q'$ indexed by tuples $(a_1, \ldots, a_d)$ in sorted order, i.e., with $a_1 < a_2 < \cdots < a_d$. □
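
The passage from sets to ordered tuples can be checked on a toy instance. The sketch below is illustrative only: the entries of the set-indexed matrix are arbitrary integers, the name `lift_to_tuples` is hypothetical, and tuples with a repeated vertex (which do not correspond to a d-subset) are assumed to get zero entries.

```python
from itertools import combinations, product

def lift_to_tuples(Q, subsets, n, d):
    """Tuple-indexed matrix Q'(S, T) = Q(set(S), set(T)) for S, T in range(n)^d."""
    index = {frozenset(I): r for r, I in enumerate(subsets)}
    tuples = list(product(range(n), repeat=d))

    def entry(S, T):
        if len(set(S)) < d or len(set(T)) < d:
            return 0  # assumption: tuples with repeated vertices get a zero entry
        return Q[index[frozenset(S)]][index[frozenset(T)]]

    return tuples, [[entry(S, T) for T in tuples] for S in tuples]

n, d = 4, 2
subsets = list(combinations(range(n), d))
Q = [[10 * r + c for c in range(len(subsets))] for r in range(len(subsets))]
tuples, Qp = lift_to_tuples(Q, subsets, n, d)

# Rows and columns indexed by strictly increasing tuples form a principal submatrix.
pos = [k for k, S in enumerate(tuples) if all(S[i] < S[i + 1] for i in range(d - 1))]
sub = [[Qp[a][b] for b in pos] for a in pos]
assert sub == Q
```

Since the spectral norm of a principal submatrix is at most that of the full matrix, this recovers the inequality of the lemma.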

We will use the following lemma to break dependencies in certain random matrices by 
decomposing them into matrices whose entries, while still dependent, have additional structure. 

Lemma 6.14 (Random Partitioning). For $d \in \mathbb{N}$, let $Q \in \mathbb{R}^{n^d \times n^d}$.

1. Suppose $Q(I, J) = 0$ when $I \cap J \ne \emptyset$. Let $(S^1_1, \ldots, S^1_k), \ldots, (S^r_1, \ldots, S^r_k)$ be a sequence of partitions of $[n]$ into $k$ bins. Each partition induces a matrix based on $Q$ as follows:

$$Q_i[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] = \begin{cases} Q[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] & \text{if } a_j, b_j \in S^i_j \text{ for } j < k, \text{ and } a_j, b_j \in S^i_k \text{ for } j \ge k, \\ & \text{and for all } i' < i, \; Q_{i'}[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] = 0, \\ 0 & \text{otherwise.} \end{cases}$$

Then, there is a family of partitions $(S^1_1, \ldots, S^1_k), \ldots, (S^r_1, \ldots, S^r_k)$ such that $Q = \sum_{i=1}^r Q_i$ with $r \le O(k^k \log n)$.

2. Let $Q^j \in \mathbb{R}^{n^d \times n^d}$, for each $1 \le j \le n$, be matrices such that $Q^j(I, J) = 0$ whenever $I \cap J \ne \emptyset$ or $j \in I \cup J$. Suppose $Q = \sum_{j=1}^n Q^j$. For a partition $(S_1, \ldots, S_k, T)$ of $[n]$ into $k + 1$ parts, say that $j, (a_1, \ldots, a_d), (b_1, \ldots, b_d)$ respect the partition if $j \in T$ and $a_i, b_i \in S_i$ for all $i$. Let a sequence of partitions $(S^1_1, \ldots, S^1_k, T^1), \ldots, (S^r_1, \ldots, S^r_k, T^r)$ of $[n]$ into $k + 1$ parts induce matrices $Q_i$ in the following way:

$$Q_i[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] = \sum_{j} Q^j[(a_1, \ldots, a_d), (b_1, \ldots, b_d)],$$

where the sum ranges over the set of indices $j$ such that $j, (a_1, \ldots, a_d), (b_1, \ldots, b_d)$ respect the partition $(S^i_1, \ldots, S^i_k, T^i)$ and do not respect any partition $(S^{i'}_1, \ldots, S^{i'}_k, T^{i'})$ for any $i' < i$.

Then, there is a family $(S^1_1, \ldots, S^1_k, T^1), \ldots, (S^r_1, \ldots, S^r_k, T^r)$ of partitions of $[n]$ so that $Q = \sum_{i=1}^r Q_i$ with $r \le O((k+1)^{k+1} \log n)$.

Proof of Lemma 6.14. We present the proof of 1; the proof of 2 is almost identical. For $r$ to be chosen later, we pick partitions $(S^1_1, \ldots, S^1_k), \ldots, (S^r_1, \ldots, S^r_k)$ uniformly at random and independently so that each is a partition of $[n]$ into sets of size $n/k$ each.

Call $(a_1, \ldots, a_d), (b_1, \ldots, b_d)$ good at step $i$ if $a_j, b_j \in S^i_j$ for every $j < k$ and $a_j, b_j \in S^i_k$ if $j \ge k$. It is enough to show that after $r = O(k^k \log n)$ steps, with positive probability every $(a_1, \ldots, a_d), (b_1, \ldots, b_d)$ with $|\{a_1, \ldots, a_d, b_1, \ldots, b_d\}| = 2d$ is good at some step $i \le r$.

Fix some $(a_1, \ldots, a_d), (b_1, \ldots, b_d)$ with $|\{a_1, \ldots, a_d, b_1, \ldots, b_d\}| = 2d$. It is good at step $i$ with probability at least $k^{-k}$. Since the steps are independent, after $r$ steps

$$\mathbb{P}\big((a_1, \ldots, a_d), (b_1, \ldots, b_d) \text{ is not good at any step}\big) \le \left(1 - \frac{1}{k^k}\right)^r = \left(\left(1 - \frac{1}{k^k}\right)^{k^k}\right)^{r / k^k} \le \left(\frac{1}{e}\right)^{r / k^k},$$

which is at most $1/n^{10d}$ for some $r = O(k^k \log n)$.

Taking a union bound over all $O(n^{2d})$ tuples $(a_1, \ldots, a_d), (b_1, \ldots, b_d)$ with $|\{a_1, \ldots, a_d, b_1, \ldots, b_d\}| = 2d$ completes the proof. □
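
The covering argument in part 1 can be simulated on a toy instance. The sketch below is an illustration under assumptions, not the construction of the proof: it uses small hypothetical parameters (n = 8, d = 2, k = 2), 0-indexed positions and bins, and a fixed random seed. It draws random equal-size partitions until every pair of disjoint ordered tuples is good for some partition, and records the first round at which each pair is good. Assigning each entry of Q to its first good round is exactly what makes the induced matrices sum to Q.

```python
import random
from itertools import permutations

def random_partition_cover(n, d, k, seed=0, max_rounds=10_000):
    """Draw random partitions of range(n) into k equal bins until every pair of
    disjoint ordered d-tuples is good for some partition.

    Returns (first_good, rounds, total), where first_good maps each pair
    (a_1..a_d, b_1..b_d) to the first round in which it was good.
    """
    rng = random.Random(seed)
    pairs = list(permutations(range(n), 2 * d))  # 2d distinct vertices
    first_good = {}
    rounds = 0
    while len(first_good) < len(pairs) and rounds < max_rounds:
        order = list(range(n))
        rng.shuffle(order)
        bin_of = {v: pos * k // n for pos, v in enumerate(order)}  # bins of size n/k
        for p in pairs:
            if p in first_good:
                continue
            a, b = p[:d], p[d:]
            # Position j (0-indexed) must land in bin min(j, k - 1).
            if all(bin_of[a[j]] == bin_of[b[j]] == min(j, k - 1) for j in range(d)):
                first_good[p] = rounds
        rounds += 1
    return first_good, rounds, len(pairs)

first_good, rounds, total = random_partition_cover(n=8, d=2, k=2)
assert len(first_good) == total  # every entry of Q lands in exactly one Q_i
```

The number of rounds needed in a run is random; the lemma only asserts that O(k^k log n) rounds suffice for some choice of partitions.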

Finally, the following lemma relates the norm of a matrix in $\mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ that has a nonzero entry $(I, J)$ only if $|I \cap J| = q$ to the norm of a matrix $X$ of which it is a lift and which has nonzero entries $(I, J)$ only when $I \cap J = \emptyset$. The latter case is easier to handle, and the idea of lifts helps reduce the norm computation for lifts of $X$ to that of $X$.

Definition 6.15 (Lifts of Matrices, Equation 8.5 in [MPW15]). For a matrix $X \in \mathbb{R}^{\binom{[n]}{d-i} \times \binom{[n]}{d-i}}$ for some $0 \le i \le d$ such that $X(I', J') = 0$ whenever $I' \cap J' \ne \emptyset$, define the lift $X^{(i)} \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ to be the matrix defined by:

$$X^{(i)}(I, J) = \begin{cases} X(I \setminus (I \cap J), \, J \setminus (I \cap J)), & \text{if } |I \cap J| = i, \\ 0, & \text{otherwise.} \end{cases}$$

The usefulness of the above definition is captured by the following claim:

Fact 6.16 (Lemma 8.4 in [MPW15]). Let $X \in \mathbb{R}^{\binom{[n]}{d-i} \times \binom{[n]}{d-i}}$ for some $0 \le i \le d$ be such that $X(I', J') = 0$ whenever $I' \cap J' \ne \emptyset$. Then, for the lift $X^{(i)}$ of $X$, we have:

$$\|X^{(i)}\| \le \binom{d}{i} \cdot \|X\|.$$

6.4.2 Graph-Theoretic Definitions and Lemmas 

In this section, we set up some notation and definitions helpful in our proofs of the main results of this section. The next few definitions and notation are generalizations of the ones used in [DM15] to general degrees $d$ and are useful in the proof of Lemma 6.11.


Definition 6.17. Let $U$ be a bipartite graph on vertices $\{1, 2, \ldots, d\} \times \{1', 2', \ldots, d'\}$. A $U$-ribbon of length $2\ell$ is a graph $R$ on $2\ell d$ vertices

$$a^1_1, \ldots, a^d_1, \; \ldots, \; a^1_\ell, \ldots, a^d_\ell, \qquad b^1_1, \ldots, b^d_1, \; \ldots, \; b^1_\ell, \ldots, b^d_\ell.$$

We install edges in $R$ by placing a copy of $U$ on vertices $1, 2, \ldots, d$ and $1', 2', \ldots, d'$ (with the label $j$ or $j'$ matching the upper index of the $a$s and $b$s respectively) on $a^1_i, \ldots, a^d_i, b^1_i, \ldots, b^d_i$ and on $a^1_i, \ldots, a^d_i, b^1_{i-1}, \ldots, b^d_{i-1}$ for every $i \le \ell$. For $i = 1$, we treat $i - 1$ as $\ell$ (modular addition). Often we will omit the length parameter $2\ell$ when it is clear from context.


Definition 6.18. Let $G$ be a graph. A labeled $U$-ribbon is a tuple $(R, F)$ where $R$ is a $U$-ribbon and $F : R \to G$ is a map labeling each vertex of $R$ with a vertex in $G$. We require that for $(u, v)$ an edge in $R$, $F(u) \ne F(v)$.

Definition 6.19. Let $(R, F)$ be a labeled $U$-ribbon where $U$ has $2d$ vertices. We say $(R, F)$ is disjoint if for every $i$,

$$|\{F(a^1_i), \ldots, F(a^d_i), F(b^1_i), \ldots, F(b^d_i)\}| = |\{F(a^1_{i+1}), \ldots, F(a^d_{i+1}), F(b^1_i), \ldots, F(b^d_i)\}| = 2d.$$

Definition 6.20. Let $(R, F)$ be a labeled $U$-ribbon where $U$ has $2d$ vertices. We say that $(R, F)$ is contributing if no element of the multiset $\{(F(u), F(v)) : (u, v) \in R\}$ occurs with odd multiplicity.

The following combinatorial lemma will serve as a tool in the proofs of the main results for this 
section. 

Lemma 6.21. Let $(R, F)$ be a contributing labeled $U$-ribbon of length $2\ell$. Recall that $R$ has vertex set $a^j_i, b^j_i$ for $i \in [\ell]$ and $j \in [d]$. Let $k \le d$. Suppose that the sets

$$\{F(a^1_i), F(b^1_i)\}_{i \in [\ell]}, \; \ldots, \; \{F(a^k_i), F(b^k_i)\}_{i \in [\ell]}, \; \{F(a^j_i), F(b^j_i)\}_{i \in [\ell], \, j \in [k+1, d]}$$

are disjoint. Then if $U$ contains the edges $\{(1,1), \ldots, (k,k)\}$ (where we identify the vertex set of $U$ with $[d] \times [d]$), $\{F(u) : u \in R\}$ has size at most $(2d - k)\ell + k$.

Proof. The assumption on $U$ implies that $R$ contains the cycles

$$C_1 = (a^1_1, b^1_1, \ldots, b^1_\ell, a^1_1), \quad \ldots, \quad C_k = (a^k_1, b^k_1, \ldots, b^k_\ell, a^k_1).$$

In order for $(R, F)$ to be contributing, every edge $(u, v) \in R$ must have a partner $(u', v') \ne (u, v)$ so that $F(u') = F(u)$ and $F(v') = F(v)$. By our disjointness assumption, every edge in cycle $C_i$ must be partnered with another edge in $C_i$. Thus, now temporarily identifying edges when they are labeled identically, each $C_i$ is a connected graph with at most $\ell$ unique edges (since each of the $2\ell$ edges must be partnered). It thus has at most $\ell + 1$ unique vertex labels. Among the cycles $C_1, \ldots, C_k$, there are thus at most $k(\ell + 1)$ unique vertex labels. In the rest of the ribbon $R$ there can be at most $2\ell(d - k)$ unique vertex labels, because once the cycles $C_1, \ldots, C_k$ are removed there are only that many vertices left in $R$. So in total there are at most $k(\ell + 1) + 2\ell(d - k) = (2d - k)\ell + k$ unique labels. □
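
The count of ℓ + 1 labels per cycle can be checked by brute force in the smallest case d = k = 1, where U is a single edge and the ribbon is just a cycle on 2ℓ vertices. The sketch below is illustrative: the function name is hypothetical, labels are integers, and labeled edges are treated as unordered pairs (an assumption matching the symmetry of edge indicators).

```python
from collections import Counter
from itertools import product

def max_labels_contributing_cycle(ell, num_labels):
    """Largest number of distinct labels over labelings of the cycle
    a_1, b_1, ..., a_ell, b_ell in which adjacent vertices get different labels
    and every labeled edge occurs an even number of times."""
    m = 2 * ell
    best = 0
    for F in product(range(num_labels), repeat=m):
        edges = [frozenset((F[t], F[(t + 1) % m])) for t in range(m)]
        if any(len(e) < 2 for e in edges):
            continue  # an edge must join two different labels
        if all(c % 2 == 0 for c in Counter(edges).values()):
            best = max(best, len(set(F)))
    return best

# For d = k = 1 the bound (2d - k) * ell + k equals ell + 1.
assert max_labels_contributing_cycle(2, 4) == 3
assert max_labels_contributing_cycle(3, 5) == 4
```

The maximum is attained by walking along a path on ℓ + 1 labels and back, so the bound of the lemma is tight in this case.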

The next few definitions and notation are needed in the proof of Lemma 6.12. 

Definition 6.22. Let $U$ be a bipartite graph on vertices $a_1, a_2, b_1, b_2$. A fancy $U$-ribbon $R$ of length $2\ell$ is a graph on vertices $c_1, \ldots, c_{2\ell}$ and $a^1_j, a^2_j, b^1_j, b^2_j$ for $j \in [\ell]$. On the $a$ and $b$ vertices, $R$ restricts to a $U$-ribbon of length $2\ell$. Additionally, it has edges $(c_i, a^1_j), (c_i, a^2_j), (c_i, b^1_j), (c_i, b^2_j)$.

Where $G$ is a graph, a labeled fancy $U$-ribbon is a tuple $(R, F)$ where $R$ is a fancy $U$-ribbon and $F : R \to G$ labels each vertex of $R$ with a vertex in $G$. We require for any edge $(u, v) \in R$ that $F(u) \ne F(v)$.

Lemma 6.23. Let $U$ be a nonempty bipartite graph on vertices $a_1, a_2, b_1, b_2$. Let $(R, F)$ be a contributing fancy $U$-ribbon of length $2\ell$. Suppose that the sets

$$\{F(a^1_j), F(b^1_j)\}, \quad \{F(a^2_j), F(b^2_j)\}, \quad \{F(c_i)\}$$

are disjoint. Then $\{F(u) : u \in R\}$ contains at most $3\ell + 2$ distinct labels. If $U$ is empty, then $\{F(u) : u \in R\}$ contains at most $4\ell + 2$ distinct labels.

Proof. First suppose $U$ is nonempty. By swapping $a^1_j, a^2_j$ or $b^1_j, b^2_j$ or both as necessary (which does not change whether $(R, F)$ is contributing), we may assume that $U$ contains the edge $(a_1, b_1)$ and thus that $R$ contains the edges $(a^1_j, b^1_j)$ and $(b^1_j, a^1_{j+1})$ (where as usual addition is modulo $\ell$).

Because $(R, F)$ is contributing, every edge must have an identically-labeled partner. By our disjointness assumptions, edges among $\{a^1_j, b^1_j\}$ may be partnered only to edges similarly among $\{a^1_j, b^1_j\}$. Also, edges between $\{c_i\}$ and $\{a^2_j, b^2_j\}$ may be partnered only to edges between $\{c_i\}$ and $\{a^2_j, b^2_j\}$. Thus, the $2\ell$-edge-long cycle on vertices $\{a^1_j, b^1_j\}$ may have at most $\ell$ uniquely-labeled edges, and the $4\ell$-edge-long cycle on vertices $\{a^2_j, b^2_j, c_i\}$ may have at most $2\ell$ uniquely-labeled edges. Since both are connected, the former may have at most $\ell + 1$ unique vertex labels and the latter at most $2\ell + 1$ unique vertex labels. Thus there are at most $3\ell + 2$ unique vertex labels in $(R, F)$.

When $U$ is empty the proof is similar: there are two paths, $\{a^1_j, b^1_j, c_i\}$ and $\{a^2_j, b^2_j, c_i\}$. □
6.4.3 Proofs of Lemma 6.11 and Lemma 6.12

Proof of Lemma 6.11. By Lemma 6.13 it is enough to prove the analogous claims for the $n^d \times n^d$ matrix $Q$ with entries given by

$$Q[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] = \begin{cases} \prod_{(i,j) \in B} g_{a_i, b_j} & \text{if } |\{a_1, \ldots, a_d, b_1, \ldots, b_d\}| = 2d, \\ 0 & \text{otherwise.} \end{cases}$$

By multiplying $Q$ by suitable permutation matrices $P, P'$ to give $PQP'$, we may assume in the 2-matching case above that the matching is $\{(1,1), (2,2)\}$ and in the nonempty graph case that the edge contained is $(1,1)$ (where we think of the vertex set of $B$ as $[d] \times [d]$). Note that $\|Q\| = \|PQP'\|$.

We apply Lemma 6.14 to obtain a family of matrices $\{Q_i\}_{i \in [r]}$ for some $r = O(3^3 \log n) = O(\log n)$ satisfying $Q = \sum_i Q_i$. On any entry $(a_1, \ldots, a_d), (b_1, \ldots, b_d)$ on which $Q_i$ is nonzero it is equal to $Q$ at that entry, and furthermore for each $Q_i$ there is a partition $(S^i_1, S^i_2, S^i_3)$ of $[n]$ so that if $Q_i[(a_1, \ldots, a_d), (b_1, \ldots, b_d)] \ne 0$ then $a_1, b_1 \in S^i_1$, $a_2, b_2 \in S^i_2$, and $a_j, b_j \in S^i_3$ for all $j > 2$.

We show that every matrix $Q_i$ has bounded spectral norm. To save on indices, let $N = Q_i$. Let $(S_1, S_2, S_3)$ be the partition of $[n]$ corresponding to $N$. We bound $\mathbb{E} \, \mathrm{Tr}(NN^\top)^\ell$ for some $\ell$ to be chosen later.

Let $\mathcal{R}(N)$ be the set of contributing disjoint labeled $B$-ribbons $(R, F)$ of length $2\ell$ with $F(a^1_i), F(b^1_i) \in S_1$, $F(a^2_i), F(b^2_i) \in S_2$ and $F(a^j_i), F(b^j_i) \in S_3$ for $j > 2$. Then $\mathbb{E} \, \mathrm{Tr}(NN^\top)^\ell \le O(\ell^\ell) |\mathcal{R}(N)|$. (Here we have an inequality rather than an equality because some elements of $\mathcal{R}(N)$ may correspond to entries of $N$ which are zero because they appeared in some other part of the partitioning scheme, and the $O(\ell^\ell)$ factor accounts for reorderings of the labels.)

Supposing that $B$ contains a 2-matching, by Lemma 6.21, each $(R, F) \in \mathcal{R}(N)$ contains at most $(2d - 2)\ell + 2$ unique labels $\{F(u) : u \in R\}$. So there are at most $n^{2\ell(d-1) + 2}$ elements of $\mathcal{R}(N)$. It follows by Markov's inequality that for any $\alpha > 0$,

$$\mathbb{P}(\|N\| > \alpha) \le \mathbb{P}(\mathrm{Tr}(NN^\top)^\ell > \alpha^{2\ell}) \le \frac{O(\ell^\ell) \, n^{2\ell(d-1) + 2}}{\alpha^{2\ell}}.$$

Choosing $\alpha \ge O(\ell^{1/2}) \, n^{d - 1 + 10/\ell} (\log n)^{1/2\ell} \, 2^{d^2/2\ell}$ makes this at most $(n^{10} \log(n) \, 2^{d^2})^{-1}$. Choose $\ell = (\log n)^2$ so that there is such an $\alpha$ also satisfying $\alpha = O(n^{d-1} \log(n)^2)$ (so long as $d \le O(\log n)$ as assumed). Taking a union bound over the $O(\log n)$ matrices $Q_i$, we get that

$$\mathbb{P}(\text{exists } i \text{ with } \|Q_i\| > O(\log(n)^2 \, n^{d-1})) \le n^{-10} 2^{-d^2},$$

and so by the triangle inequality applied to $\|Q\| = \|\sum_i Q_i\|$, we get

$$\mathbb{P}(\|Q\| > O(n^{d-1} \log(n)^3)) \le n^{-10} 2^{-d^2}.$$

The case that $B$ contains only a 1-matching is similar, replacing the $(2d - 2)\ell + 2$ unique vertices in a contributing $B$-ribbon with $(2d - 1)\ell + 1$, again by Lemma 6.21. □

Proof of Lemma 6.12. We first handle the case when $U$ is nonempty. By Lemma 6.13 it is enough to prove the analogous statement for the $n^2 \times n^2$ matrix, also by abuse of notation denoted $Q$, which is the sum of matrices (abusing notation again) $Q^k$ with entries given by

$$Q^k[(a_1, a_2), (b_1, b_2)] = \begin{cases} g_{k,a_1} g_{k,a_2} g_{k,b_1} g_{k,b_2} \prod_{(i,j) \in U} g_{a_i, b_j} & \text{if } |\{a_1, a_2, b_1, b_2\}| = 4, \\ 0 & \text{otherwise.} \end{cases}$$

By multiplication with an appropriate permutation matrix (which cannot change the spectral norm), we may assume that $U$ contains the edge $(1,1)$. We begin with item 2 of Lemma 6.14, whose hypotheses are satisfied by our convention $g_{a,a} = 0$. This gives a family $Q_1, \ldots, Q_r$ with $r = O(3^3 \log n) = O(\log n)$ so that $Q = \sum_{i=1}^r Q_i$, and a corresponding family of partitions $(S^1_1, S^1_2, T^1), \ldots, (S^r_1, S^r_2, T^r)$. Lemma 6.14 guarantees that $Q_i[(a_1, a_2), (b_1, b_2)] = \sum_{k \in T} g_{k,a_1} g_{k,a_2} g_{k,b_1} g_{k,b_2} \prod_{(i',j) \in U} g_{a_{i'}, b_j}$ for some $T \subseteq T^i$ when $a_1, b_1 \in S^i_1$ and $a_2, b_2 \in S^i_2$, and is zero otherwise.

Fix some $i \in [r]$ and let $N = Q_i$ (to save on indices). We will bound $\mathbb{E} \, \mathrm{Tr}(NN^\top)^\ell$ for some $\ell$ to be chosen later. Let $(S_1, S_2, T)$ be the partition of $[n]$ corresponding to $N$.

Let $\mathcal{R}(N)$ be the set of contributing labeled fancy $U$-ribbons of length $2\ell$ so that for each $c_i \in R$, $F(c_i) \in T$, for each $a^1_j, b^1_j$ we have $F(a^1_j), F(b^1_j) \in S_1$, and for each $a^2_j, b^2_j$ we have $F(a^2_j), F(b^2_j) \in S_2$.

Expanding $\mathbb{E} \, \mathrm{Tr}(NN^\top)^\ell$ as usual, we see that $\mathbb{E} \, \mathrm{Tr}(NN^\top)^\ell \le \ell^{2\ell} |\mathcal{R}(N)|$. (As in the proof of Lemma 6.11, we have an inequality rather than an equality because some entries of $N$ may not have a sum over all elements of $T$ if there is overlap with previous parts of the partitioning scheme.) By Lemma 6.23, $|\mathcal{R}(N)| \le n^{3\ell + 2}$.

By Markov's inequality,

$$\mathbb{P}(\|N\| > \alpha) \le \mathbb{P}(\mathrm{Tr}(NN^\top)^\ell > \alpha^{2\ell}) \le \frac{\ell^{2\ell} \, n^{3\ell + 2}}{\alpha^{2\ell}}.$$

Taking $\alpha \ge \ell \, n^{3/2 + 13/2\ell}$ guarantees that this is at most $n^{-11}$. If $\ell = \Theta(\log n)$, then there is such an $\alpha$ satisfying also $\alpha = O(n^{3/2} \log(n))$.

By a union bound and triangle inequality, we then get

$$\mathbb{P}(\|Q\| > O(n^{3/2} (\log n)^2)) \le O(n^{-10}). \qquad \square$$

The proof in the case of $U$ empty is similar, using Lemma 6.23 in the empty $U$ case.


7 Analyzing Deviations for the Degree-$d$ MPW Operator

In this section, we use the tools developed in Section 6 to analyze the spectrum of the deviation matrix $D = \mathcal{M} - E$ and prove Lemma 4.7.

As noted in Section 7, we decompose $\mathcal{M} = E + D$. For any $I, J \in \binom{[n]}{d}$, $D(I, J)$ depends on a) $\deg(I \cup J)$ and b) whether $\mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G$. If $D(I, J)$ depended only on b) above, then it could be decomposed into a sum of patterned matrices defined in Section 6; analyzing these is tractable. Our first step is thus to get rid of the dependence on $\deg(I \cup J)$, the only part depending on the entire graph. We will obtain a matrix $L$ that depends only on whether $\mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G$ or not (and thus is "locally random" in the sense of [MPW15]).

Specifically, we write $D = L + \Delta$ where $L$ is the locally random part obtained by replacing $D(I, J)$ by $\mathbb{E}[D(I, J) \mid \mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G]$ whenever $\mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G$ and by an appropriate negative constant when $\mathcal{E}_{\mathrm{ext}}(I, J) \not\subseteq G$ (this makes the expectation of each entry over $G \sim G(n, \frac{1}{2})$ equal to 0). More concretely, following [MPW15], we define:


Definition 7.1. 


a(i) 


( 2 d-j) (n - 2d + i 

( 2d 1 

V2 d-i> 




and p(i) = 2 ^ for each i. We set L(I,J) for every I,J e ( 'j ) to be 

L (i ]) = i a{{1 n Jl) ' ii8{1 u J) x m u £(/)) - G 

I -a(\I n /I), otherwise. 


We define $\Delta = D - L$.


(7.1) 


The idea behind the definition is that $L(I, J) = \mathbb{E}_G[D(I, J) \mid \mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G]$ whenever $\mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G$ and, in the other case, is chosen to make $\mathbb{E}_G[L(I, J)] = 0$. We will analyze $L$ and $\Delta$ separately. The proof of Lemma 4.7 is broken into two main pieces. Each piece analyzes the action of $L$ and $\Delta$ split across various eigenspaces $V_0, V_1, \ldots, V_d$ of the matrix $E$. Such fine grained analysis for the case of $d = 2$ was done in [DM15]. A few points of distinction from [DM15] are in order at this point.

The first is regarding the high level approach. The approach of [DM15] used explicit expressions for a canonical set of eigenvectors in $V_1$ to obtain similar conclusions as us for the case of $d = 2$. This approach gets unwieldy very quickly because the explicit entries of eigenvectors for $V_i$ for $i > 1$ are hard to work with [BI84]. We tackle this issue by developing an argument that doesn't need explicit entries of the eigenvectors. Instead, we use basic representation theory (Section 6.3) to identify a set of symmetries satisfied by vectors in $V_i$ for each $i$ and use it to obtain the conclusions we require.

Second, [DM15] deal with the optimization version of the degree 4 SOS program which, as noted in the introduction, could potentially be weaker than the one we analyze here (and thus our lower bound is technically stronger). This simplifies the analysis in [DM15] a little, as the matrix $\Delta$ defined above is identically zero for the operator analyzed there. We explicitly work with the feasibility version of the degree 4 SOS program and thus must deal with the additional complexity of handling $\Delta$. It turns out that we have to do a fine grained analysis of the $\Delta$ matrix itself. The decomposition we use for $\Delta$ is somewhat different from the case of $L$, even though the analysis of each piece of the decomposition proceeds similarly to the case of $L$.

Third, for the special case of $d = 2$, essentially the only matrix one has to analyze is $L_0$, the matrix obtained by zeroing out all entries $(I, J)$ in $L$ such that $I \cap J \ne \emptyset$: a uniform bound on the spectral norm of the remaining component suffices. However, for higher $d$, one has to deal with the "non-disjoint" entries with some care, and an argument analogous to the one in [DM15] fails to show PSDness of $\mathcal{M}$ beyond $\omega \sim n^{1/(2d)}$, giving no asymptotic improvement over [MPW15].

Finally, our argument for analyzing the spectral norms of each of the pieces also needs to be much more general than in the case of [DM15] to handle higher degrees. For this, we identify a simple combinatorial structure (size of maximum matchings in appropriate bipartite graphs on $2d$ vertices) that controls the bounds and could also be used to obtain slick proofs of the conclusions required in [DM15] in the context of analyzing $\mathcal{M}$ for $d = 2$. Our combinatorial argument itself is a generalization of the one given by [DM15] for this case.

We now go on to describe the two lemmas that encapsulate the technical heart of the proof of Lemma 4.7. The first does a fine grained analysis of the spectrum of $L$. In the following, let $\mathcal{M}$ be the filled-in matrix for the degree-$d$ MPW operator at clique size $\omega$ (Definition 4.5), $E = \mathbb{E}_{G \sim G(n, 1/2)}[\mathcal{M}]$, $D = \mathcal{M} - E$, $L$ be as in Definition 7.1, and $\Pi_i$ be the projectors to the spaces $V_i$ of Fact 4.9.

Lemma 7.2 (Bounding Blocks of L). With probability at least $1 - \frac{1}{n}$ over the draw of $G \sim G(n, \frac{1}{2})$, each $0 \le i, j \le d$ satisfies

1. $$\|\Pi_i L \Pi_j\| \le 2^{2d} \, \tilde{O}(\omega^{2d} n^{d - \frac{1}{2}}),$$

2. If $i, j \ge 2$, then

$$\|\Pi_i L \Pi_j\| \le 2^{2d} \, \tilde{O}(\omega^{2d} n^{d - 1}).$$

The next lemma does an (even more) fine grained analysis of the spectrum of $\Delta$:

Lemma 7.3. With probability at least 1 - p over the draw ofG ~ G(n, \),for each 0 ^ i ^ d: 

|n,An ; | < 0(2 0(rf) cu 2d_min,i; V~2) + 6(l 0{d) (o 2d ^n d ^). 

We can now use Lemma 7.2 and Lemma 7.3 to complete the proof of Lemma 4.7. 

Proof of Lemma 4.7. For each $0 \le i, j \le d$, we compute $\Pi_i \mathcal{M} \Pi_j$ and use Lemma 3.4. We write $\mathcal{M} = E + L + \Delta$. First, $\Pi_i E \Pi_j = 0$ whenever $i \ne j$, as the $\Pi_i$ are projectors to eigenspaces of $E$. Let $\lambda_0, \lambda_1, \ldots, \lambda_d$ be the eigenvalues of $E$ on eigenspaces $V_0, V_1, \ldots, V_d$. From Lemma 4.10, we have:

$$\lambda_i \ge 2^{-O(d^2)} \cdot n^d \cdot \omega^{2d - i}.$$

Thus,

$$\Pi_i E \Pi_i \succeq 2^{-O(d^2)} \, n^d \, \omega^{2d - i} \, \Pi_i.$$

In what follows, all our statements hold with probability at least 1 - 0(1)/m 
Using Lemma 7.2 and Lemma 7.3, for every i ^ 2, 

||II/(L + A)n z || < O(o> 2d n d_1 ) + Oico^r&i). 


On the other hand, when i ^ 2, 


||n,-(L + A)n,|| sC d{co 2d n d -l). 


Then, it is easy to check that for any co = 0(n d+i ), 

1LATIL = n,-(E + L + A)n,- ^ 


Next, we bound the cross terms |n;(E + A)IT , for i 4 j. Again, using Lemma 7.2 and Lemma 7.3, 
we have for i, j ^ 2: 
\Ui(L + A)n ; | ^ 0(cu 2d n d_1 ) + 0(cu 2rf “ min| ^n d “5). 

For co K 0(n 3+r), it is again easy to check that, for i, j ^ 2, the above expression is at most 

2 

In the case when one of i, j is at most 1, we have the bound 

|n ; (L + A)n ; | ^ 0{co ld n d - l 2). 

In this case, it is easy to check that so long as co = 0(h3tt / poly log (n)), 

in,(L + A)n ; | ^ Oico 2 ^) ^ - d 

By an application of Lemma 3.4, the proof is complete. □ 

7.1 Proof of Lemma 7.2 

Proof Plan. We first describe the high level idea of the proof. 

We start by decomposing $L = \sum_{q=0}^{d} L_q$ where $L_q(I, J) = L(I, J)$ if $|I \cap J| = q$ and 0 otherwise. Notice that each $L_q$ is then obtained by scaling an appropriate 0/1 matrix.

Most illuminating is the disjoint case $L_0$, which is nonzero only at entries $I, J$ with $I \cap J = \emptyset$. For any disjoint $I, J$, $L_0(I, J)$ depends only on whether $\mathcal{E}_{\mathrm{ext}}(I, J) \subseteq G$, which one could write as an appropriately scaled AND function of the indicators $g_e$ of edges $e \in \mathcal{E}_{\mathrm{ext}}(I, J)$. We can expand this AND function in the monomial (parities of subsets of $g_e$ variables) basis. Each such monomial corresponds to the bipartite graph $B$ that contains the pairs $e \in \mathcal{E}_{\mathrm{ext}}(I, J)$ that constitute the monomial. This gives a decomposition of $L_0$ into $2^{d^2} - 1$ (since the constant term is 0, $L$ being zero mean) components, $L_0^B$ for each nonempty, labeled bipartite graph $B$ on $[d] \times [d]$.

We can bound the spectral norm of each of the pieces $L_0^B$ by direct application of tools derived in Section 6.4. The main work in this section goes into showing that, depending upon the structure of $B$, an appropriate selection of subspaces $V_j$ lie in left or right kernels of $L_0^B$. Thus, for a fixed term $\Pi_i L \Pi_j$, some $L_0^B$ do not contribute. We identify the maximum spectral norm among contributing terms to obtain the final bound.

To accomplish this goal, we rely heavily on the tools built in Section 6.3, which give us a handle on the symmetries of the eigenspaces $V_0, V_1, \ldots, V_d$. This requires some work based on representation theory of finite groups and is presented in Lemma 6.4 and Lemma 6.9.

The case of $L_q$ for $q \ne 0$ needs an even finer decomposition. We decompose each $L_q$ ($q > 0$) into matrices that identify the "pattern" of the $q$ intersecting vertices. In [MPW15] a similar idea is used to reduce the task of bounding the spectral norm of $L_q$ to a calculation similar to the one in the case of $L_0$. However, unlike [MPW15], we also require properties of the kernels of the components of the decomposition. After restricting to a fixed intersection pattern of $q$ vertices, we thus resort to using a generalization of the kernel analysis used for the $L_0$ case. We now proceed with the proof plan as described, beginning with the decomposition of each $L_q$.
7.1.1 Decomposing L

We start by decomposing $L$ further as $L = \sum_{q=0}^{d} L_q$ where for any $I, J \in \binom{[n]}{d}$,

$$L_q(I, J) = \begin{cases} L(I, J) & \text{if } |I \cap J| = q, \\ 0 & \text{otherwise.} \end{cases}$$

Decomposing $L_0$. Recall that $\mathcal{B}$ is the set of all bipartite labeled graphs with left and right vertex sets labeled by $[d]$ and $[d]$. Recall also from Section 6.3 that for any $I, J \subseteq [n]$ with $|I| = |J| = d$, the graph $\zeta_{I,J}(B)$ is a copy of $B$ on vertex sets $I, J$ where the correspondence between $I$ and $[d]$ and $J$ and $[d]$ is determined by the sorting map $\zeta$. Finally, recall that for a graph $G$ on $[n]$, we let $g_b$ be the $\pm 1$ indicator for the presence of edge $b$ in $G$, and by convention $g_b = 0$ when $b = (i, i)$ for any $i \in [n]$. For any $B \in \mathcal{B}$, define a $\binom{[n]}{d} \times \binom{[n]}{d}$ matrix

$$\tilde{L}_0^B(I, J) = \alpha(0) \cdot (2^{d^2} - 1) \prod_{b \in \zeta_{I,J}(B)} g_b.$$

The idea is to write $L_0$ as a sum of such matrices $\tilde{L}_0^B$ with the entries corresponding to $I, J$ where $|I \cap J| \ne 0$ zeroed out. Thus, define $L_0^B$ to be the matrix with $(I, J)$ entry given by

$$L_0^B(I, J) = \begin{cases} \tilde{L}_0^B(I, J) & \text{if } |I \cap J| = 0, \\ 0 & \text{otherwise.} \end{cases}$$

We think of $L_0$ as a rescaling and centering of a 0/1 matrix whose entries are the AND of the $\pm 1$ indicators for the edges in $\mathcal{E}_{\mathrm{ext}}(I, J)$. Decomposing these ANDs into monomials over those $\pm 1$ indicators, we see that each monomial corresponds exactly to one bipartite graph $B$, and the centering of $L_0$ corresponds to removing the constant monomial, which corresponds to the empty bipartite graph. Every other monomial receives equal weight $2^{-d^2}$ in this expansion, and so from these observations it becomes routine to verify that

$$2^{-d^2} \sum_{B \in \mathcal{B}} L_0^B = L_0.$$
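
The monomial expansion of the AND function used in this decomposition can be verified exhaustively for small d. The sketch below is a sanity check rather than part of the proof; it assumes the ±1 convention in which +1 means the edge is present, and the function name is hypothetical. It confirms that the 0/1 AND of m indicators equals $2^{-m}$ times the sum of all $2^m$ monomials.

```python
from itertools import combinations, product
from math import prod

def and_via_monomials(x):
    """2^{-m} * sum over subsets S of prod_{b in S} x_b, for x in {-1, 1}^m."""
    m = len(x)
    total = sum(prod(x[b] for b in S)
                for size in range(m + 1)
                for S in combinations(range(m), size))
    return total / 2 ** m

m = 4  # d = 2 gives d^2 = 4 potential edges between disjoint I and J
for x in product((-1, 1), repeat=m):
    expected = 1 if all(v == 1 for v in x) else 0
    assert and_via_monomials(x) == expected
```

Dropping the empty subset (the constant monomial) subtracts $2^{-m}$, which is the mean of the AND under uniform ±1 inputs; this is the centering step described in the text.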


Decomposing Lq. Similarly, we further decompose Lq for q > 0. Here things are a bit more 
involved. Let us motivate our decomposition by understanding the structure of the matrix Lq for 
q > 0 a little bit. Consider an entry ( I,J) such that I n / = K. Then, S ex t(I, J ) C £ ex t (I \K,J\ K). Thus, 
the edge structure in the bipartite subgraph on vertex sets I \ K and J\K decides the value of L(I,J) 
for any graph G and we can hope to a get a patterned matrix. We now follow this intuition. 

Recall the sorting maps $\zeta_I : [d] \to I$ and $\zeta_J : [d] \to J$. Letting $Z_\ell, Z_r \subseteq [d]$ be subsets of size $q$, we define $L_q^{Z_\ell,Z_r}$ such that:

$$L_q^{Z_\ell,Z_r}(I,J) = \begin{cases} L_q(I,J) & \text{if } \zeta_I(Z_\ell) = \zeta_J(Z_r) \\ 0 & \text{otherwise.} \end{cases}$$

That is, $L_q^{Z_\ell,Z_r}$ is the "part" of $L_q$ where any $I, J$ intersect in a (size $q$) subset given by $\zeta_I(Z_\ell)$ and $\zeta_J(Z_r)$. It is then easy to see that $L_q = \sum_{Z_\ell,Z_r} L_q^{Z_\ell,Z_r}$. Next, we decompose each $L_q^{Z_\ell,Z_r}$ further based on non-empty labeled bipartite graphs $B \in \mathcal{B}_{Z_\ell,Z_r}$ for each $Z_\ell, Z_r$.
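To make the role of $Z_\ell, Z_r$ concrete, here is a small sketch. It assumes that the sorting map of a set $I$ sends position $k \in [d]$ to the $k$-th smallest element of $I$ (the definition itself lies outside this section), and the function names are hypothetical. For a pair $(I, J)$ with $|I \cap J| = q$ it lists the position sets for which $\zeta_I(Z_\ell) = \zeta_J(Z_r)$.

```python
from itertools import combinations

def zeta(I, Z):
    """Image of the (1-indexed) position set Z under the assumed sorting map of I."""
    ordered = sorted(I)
    return {ordered[k - 1] for k in Z}

def matching_positions(I, J):
    """All (Z_l, Z_r) of size q = |I & J| with zeta_I(Z_l) == zeta_J(Z_r)."""
    d = len(I)
    q = len(set(I) & set(J))
    positions = range(1, d + 1)
    return [
        (set(Zl), set(Zr))
        for Zl in combinations(positions, q)
        for Zr in combinations(positions, q)
        if zeta(I, Zl) == zeta(J, Zr)
    ]
```

For $I = \{1,4,7\}$ and $J = \{4,5,7\}$ the only match is $Z_\ell = \{2,3\}$, $Z_r = \{1,3\}$, so each entry with $|I \cap J| = q$ is picked up by exactly one pair $(Z_\ell, Z_r)$ of size $q$.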
We now define a matrix which is nonzero only on entries which intersect in at least $q$ places:

$$\tilde{L}_q^{B,Z_\ell,Z_r}(I,J) = \begin{cases} \alpha(q) \cdot (2^{(d-q)^2} - 1) \prod_{b \in \zeta_{I,J}(B)} g_b & \text{if } \zeta_I(Z_\ell) = \zeta_J(Z_r) \\ 0 & \text{otherwise.} \end{cases}$$

Again, as before, the actual decomposition needs to zero out the entries $(I,J)$ such that $|I \cap J| \neq q$. Thus, we define $L_q^{B,Z_\ell,Z_r}$ by zeroing entries of $\tilde{L}_q^{B,Z_\ell,Z_r}$ which intersect also outside of $Z_\ell, Z_r$:

$$L_q^{B,Z_\ell,Z_r}(I,J) = \begin{cases} \tilde{L}_q^{B,Z_\ell,Z_r}(I,J) & \text{if } |\zeta_I([d] \setminus Z_\ell) \cap \zeta_J([d] \setminus Z_r)| = 0 \\ 0 & \text{otherwise.} \end{cases}$$

Finally, it is again easy to verify that:

$$L_q = 2^{-(d-q)^2} \sum_{Z_\ell,Z_r \subseteq [d]} \; \sum_{B \in \mathcal{B}_{Z_\ell,Z_r}} L_q^{B,Z_\ell,Z_r} .$$


7.1.2 Spectral Analysis of L 

In order to prove Lemma 7.2, we will first use the decomposition described in the previous section 
to write L q as a sum of appropriate patterned matrices. We will then partition the sum into groups, 
each group corresponding to an equivalence class of (left or right) similar bipartite graphs B. We 
will infer some properties about the kernel and finally use the spectral norm bounds from Section 6.4 
to complete the proof. 

More concretely, let $(B, Z_\ell, Z_r)$ be a $q$-pattern (as defined in Section 6.3). From our decomposition from the previous section, we have:

$$L_q = 2^{-(d-q)^2} \sum_{Z_\ell,Z_r} \; \sum_{B \in \mathcal{B}_{Z_\ell,Z_r}} L_q^{B,Z_\ell,Z_r} . \tag{7.2}$$

Our idea is to analyze appropriate collections of $L_q^{B,Z_\ell,Z_r}$ separately. When $q = 0$, $Z_\ell, Z_r$ are redundant (being $\emptyset$) and thus $L_0^{B,Z_\ell,Z_r} = L_0^B$ in that special case. In the first step, we observe that $\tilde{L}_q^{B,Z_\ell,Z_r}$ has some symmetries that are helpful to us. We thus want to deal with the sums of $\tilde{L}_q^{B,Z_\ell,Z_r}$ instead of $L_q^{B,Z_\ell,Z_r}$. To justify this, we start by showing that the difference of the above two matrices has small norm.

Claim 7.4. For any $(B, Z_\ell, Z_r)$,

$$\|\tilde{L}_q^{B,Z_\ell,Z_r} - L_q^{B,Z_\ell,Z_r}\| \le O(\omega^{2d-q} \cdot n^{d-q-1}).$$

Next, we bound each $\|\tilde{L}_q^{B,Z_\ell,Z_r}\|$ using the machinery from Section 6.4.

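Claim 7.5 (Norm Bounds on Pieces). With probability at least $1 - 1/n^{10}$ over the draw of $G \sim G(n, \frac{1}{2})$:

1. $\|\tilde{L}_q^{B,Z_\ell,Z_r}\| \le O(\omega^{2d-q} \cdot n^{d-\frac{1}{2}})$.

2. If $|B_\ell|, |B_r| \ge 2$, then:

$$\|\tilde{L}_q^{B,Z_\ell,Z_r}\| \le O(\omega^{2d-q} \cdot n^{d-1}).$$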


At first, it should be worrisome that some of the $\tilde{L}_q^{B,Z_\ell,Z_r}$ have norms that are much larger than what we need (in the second claim of Lemma 7.2). What comes to our rescue is the fact that the components $\tilde{L}_q^{B,Z_\ell,Z_r}$ that have large norm do not contribute to quadratic forms on $\Pi_i L_q \Pi_j$ when $i, j$ are at least 2. The crucial observation that allows us to conclude this is that the $\tilde{L}_q^{B,Z_\ell,Z_r}$ are patterned matrices in the sense of Definition 6.8 and thus, clubbing all $(B', Z'_\ell, Z'_r)$ that are (left or right) similar to $(B, Z_\ell, Z_r)$, we can show that certain $V_t$ lie in their kernels. More specifically:

Claim 7.6. For $t > q + |B_\ell|$,

$$\Pi_t \sum_{(B',Z'_\ell,Z'_r) \sim_\ell (B,Z_\ell,Z_r)} \tilde{L}_q^{B',Z'_\ell,Z'_r} = 0 .$$

Similarly, for any $t > q + |B_r|$,

$$\sum_{(B',Z'_\ell,Z'_r) \sim_r (B,Z_\ell,Z_r)} \tilde{L}_q^{B',Z'_\ell,Z'_r} \, \Pi_t = 0 .$$


Before proving the three claims above, we show how they imply Lemma 7.2: 
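Proof. We use (7.2) to write:

$$L_q = 2^{-(d-q)^2} \left( \sum_{B,Z_\ell,Z_r} \tilde{L}_q^{B,Z_\ell,Z_r} + \sum_{B,Z_\ell,Z_r} \left( L_q^{B,Z_\ell,Z_r} - \tilde{L}_q^{B,Z_\ell,Z_r} \right) \right). \tag{7.3}$$

Using Claim 7.5 for the first sum and Claim 7.4 for the second,

$$\|L_q\| \le 2^{-(d-q)^2} \sum_{B,Z_\ell,Z_r} \|\tilde{L}_q^{B,Z_\ell,Z_r}\| + 2^{-(d-q)^2} \sum_{B,Z_\ell,Z_r} \|L_q^{B,Z_\ell,Z_r} - \tilde{L}_q^{B,Z_\ell,Z_r}\| \le 2^{2d} \cdot O(\omega^{2d-q} \cdot n^{d-\frac{1}{2}}).$$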


For the second part, fix an $i \ge 2$. We first show that some terms in the decomposition in (7.3) do not contribute to $\|\Pi_i L_q \Pi_i\|$.

Consider any bipartite graph $B$ such that $|B_\ell| < 2$. Then, we have from Claim 7.6, $\Pi_t \sum_{(B',Z'_\ell,Z'_r) \sim_\ell (B,Z_\ell,Z_r)} \tilde{L}_q^{B',Z'_\ell,Z'_r} = 0$ for every $t \ge 2$. Thus, $\|\Pi_i \sum_{B : |B_\ell| < 2} \tilde{L}_q^{B,Z_\ell,Z_r} \Pi_j\| = 0$ for any $j$ and every $i \ge 2$. Similarly, when $|B_r| < 2$, $\|\Pi_i \tilde{L}_q^{B,Z_\ell,Z_r} \Pi_j\| = 0$. On the other hand, when both $|B_\ell|, |B_r| \ge 2$, from Claim 7.5, we have (with high probability over the draw of $G \sim G(n, \frac{1}{2})$), $\|\tilde{L}_q^{B,Z_\ell,Z_r}\| \le O(\omega^{2d-q} \cdot n^{d-1})$.
Thus, for $i \ge 2$, we have:

$$\begin{aligned}
\|\Pi_i L_q \Pi_i\| &= 2^{-(d-q)^2} \cdot \Big\| \Pi_i \sum_{B,Z_\ell,Z_r} L_q^{B,Z_\ell,Z_r} \Pi_i \Big\| \\
&\le 2^{-(d-q)^2} \sum_{B,Z_\ell,Z_r} \|\Pi_i \tilde{L}_q^{B,Z_\ell,Z_r} \Pi_i\| + 2^{-(d-q)^2} \sum_{B,Z_\ell,Z_r} \|L_q^{B,Z_\ell,Z_r} - \tilde{L}_q^{B,Z_\ell,Z_r}\| \\
&\le 2^{-(d-q)^2} \sum_{B : |B_\ell| \ge 2, |B_r| \ge 2, \; Z_\ell, Z_r} \|\Pi_i \tilde{L}_q^{B,Z_\ell,Z_r} \Pi_i\| + O(\omega^{2d-q} \cdot n^{d-1}) \\
&\le O(\omega^{2d-q} \cdot n^{d-1}).
\end{aligned}$$

Similarly, for $i \neq j$ with $i, j \ge 2$, a calculation similar to the one above gives $\|\Pi_i L_q \Pi_j\| \le O(\omega^{2d-q} \cdot n^{d-1})$. □

In the remaining part of this section, we complete the proofs of Claims 7.6 and 7.5. 


7.1.3 Proof of Claims 


In this section, we obtain quick proofs of the three claims above using the tools developed in 
Section 6. 

We first prove Claim 7.4. 

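Proof of Claim 7.4. The proof is by appealing to Fact 3.3. Observe that

$$\left| \tilde{L}_q^{B,Z_\ell,Z_r}(I,J) - L_q^{B,Z_\ell,Z_r}(I,J) \right| \le \begin{cases} 0 & \text{if } |I \cap J| \le q, \\ \left| \alpha(q) \cdot (2^{(d-q)^2} - 1) \, 2^{-(d-q)^2} \right| & \text{otherwise.} \end{cases}$$

We now estimate

$$\max_{I \in \binom{[n]}{d}} \sum_{J \in \binom{[n]}{d}} \left| \tilde{L}_q^{B,Z_\ell,Z_r}(I,J) - L_q^{B,Z_\ell,Z_r}(I,J) \right| \le 2^d n^{d-q-1} \cdot \alpha(q) \cdot (2^{(d-q)^2} - 1) \, 2^{-(d-q)^2} .$$

The claim now follows from Fact 3.3. □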
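The row-sum argument used here (Fact 3.3, the Gershgorin circle theorem) can be illustrated numerically. The sketch below is a toy check, not the paper's computation: it builds a random symmetric $\pm 1$ matrix with zero diagonal, computes the maximum absolute row sum, and compares it against a power-iteration estimate of the spectral norm. All function names are hypothetical, and the power iteration only yields a lower estimate of the norm.

```python
import random

def random_symmetric_sign_matrix(n, seed=1):
    """Symmetric matrix with zero diagonal and independent +/-1 entries above it."""
    rng = random.Random(seed)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            M[i][j] = M[j][i] = rng.choice((-1.0, 1.0))
    return M

def max_abs_row_sum(M):
    """Largest sum of absolute values along a row (the Gershgorin-type bound)."""
    return max(sum(abs(x) for x in row) for row in M)

def norm_lower_estimate(M, iterations=200, seed=0):
    """Power iteration for symmetric M; returns ||M v|| for a unit vector v,
    which never exceeds the spectral norm."""
    rng = random.Random(seed)
    n = len(M)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    scale = sum(x * x for x in v) ** 0.5
    v = [x / scale for x in v]
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        estimate = sum(x * x for x in w) ** 0.5
        if estimate == 0.0:
            break
        v = [x / estimate for x in w]
    return estimate
```

For a random sign matrix the row-sum bound is far from tight, which is why the paper only uses it for sparse difference terms such as $\tilde{L} - L$ and relies on the trace moment method elsewhere.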


The next is a direct application of Lemma 6.9.

Proof of Claim 7.6. The main observation is that $\tilde{L}_q^{B,Z_\ell,Z_r}$ is a patterned matrix (with a $q$-pattern $(B, Z_\ell, Z_r)$) in the sense of Definition 6.8. The result then follows immediately by appealing to Lemma 6.9. □


Finally, we prove Claim 7.5 using Lemma 6.11. 

Proof of Claim 7.5. First, consider the case of $\tilde{L}_0^B$ ($Z_\ell = Z_r = \emptyset$ in this case). We write $\|\tilde{L}_0^B\| \le \|L_0^B\| + \|\tilde{L}_0^B - L_0^B\|$. For the second term, we can appeal to Claim 7.4. For the first term, observe that by a direct application of Lemma 6.11, $\|L_0^B\| \le \alpha(0)(2^{d^2} - 1) 2^{-d^2} \cdot O(n^{d-\frac{1}{2}})$. Further, when $|B_\ell|, |B_r| \ge 2$, then $B$ has a 2-matching and thus, by another application of Lemma 6.11, we obtain that in this case, $\|L_0^B\| \le \alpha(0) \cdot (2^{d^2} - 1) 2^{-d^2} \cdot O(n^{d-1})$. The proof is thus complete for the case of $q = 0$.
We now reduce the computation for the more general case to similar calculations by appealing to the idea of lifts. Consider the matrix $R \in \mathbb{R}^{\binom{[n]}{d-q} \times \binom{[n]}{d-q}}$ given by

$$R(I',J') = \begin{cases} L_q^{B,Z_\ell,Z_r}(I' \cup K, J' \cup K) & \text{if } I' \cap J' = \emptyset \\ 0 & \text{otherwise,} \end{cases}$$

where $K \subseteq [n]$ is some fixed subset of size $q$ such that $K \cap I' = K \cap J' = \emptyset$.

Then, $L_q^{B,Z_\ell,Z_r} = R^{(q)}$. Thus, using Fact 6.16, $\|L_q^{B,Z_\ell,Z_r}\| \le 2^{2d} \|R\|$. Since $R$ has nonzero entries only when the row and column indices are disjoint sets, we can apply Lemma 6.11 to $R$ to obtain $\|R\| \le \alpha(q)(2^{(d-q)^2} - 1) 2^{-(d-q)^2} \cdot O(n^{d-q-\frac{1}{2}})$. Further, when $|B_\ell|, |B_r| \ge 2$, again by an application of Lemma 6.11, we have: $\|R\| \le \alpha(q)(2^{(d-q)^2} - 1) 2^{-(d-q)^2} \cdot O(n^{d-q-1})$. Now, using $\|\tilde{L}_q^{B,Z_\ell,Z_r}\| \le \|L_q^{B,Z_\ell,Z_r}\| + \|\tilde{L}_q^{B,Z_\ell,Z_r} - L_q^{B,Z_\ell,Z_r}\|$ and using Claim 7.4 completes the proof. □


7.2 Proof of Lemma 7.3 


We now move on to analyzing the spectrum of the matrix A. 

The high level plan of the proof is similar to that of Lemma 7.2. We define $A_q$ for each $0 \le q \le d$ as follows:

$$A_q(I,J) = \begin{cases} A(I,J) & \text{if } |I \cap J| = q \\ 0 & \text{otherwise.} \end{cases}$$

We further split $A_q = \sum_{K : |K| = q} A_{q,K}$ where:

$$A_{q,K}(I,J) = \begin{cases} A_q(I,J) & \text{if } I \cap J = K \\ 0 & \text{otherwise.} \end{cases}$$

First, we observe that $A_0 = 0$. This is because $\deg(I \cup J)$ when $I$ and $J$ are disjoint is exactly 1 and doesn't depend on the graph. Thus, $A = \sum_{q=1}^{d} A_q$. As before we would like to spot patterned matrices in each $A_q$ to show that appropriate eigenspaces $V_j$ lie in the kernel of $A_q$. In the case of $A_q$, however, there is a difference in how this needs to be done. This is because each entry $(I, J)$ of $A_q$ potentially depends on the edges from every vertex in the graph $G$ to $I$ and $J$. This is unlike the case of $L$, where the $(I, J)$ entry depends only on the edges between $I$ and $J$ (in fact that is the reason we separated $L$ from $A$ in the analysis). Nevertheless, we give a decomposition below that will help us make claims similar to the ones made when analyzing $L$.

Let us first explain the main idea in the decomposition. The entry $A_q(I, J)$ depends on two events: a) whether $\mathcal{E}_{\mathrm{ext}}(I,J) \subseteq G$ and b) the number of subsets $S \subseteq [n]$ of size $|S| = q$ that form $2d$-cliques with $I \cup J$. The main observation that motivates our decomposition is the following: in the event that $\mathcal{E}_{\mathrm{ext}}(I,J) \subseteq G$, the deviation in $\deg(I \cup J)$ is completely captured (up to low order terms) by just the number of vertices $s$ that have an edge to all of $I \cup J$ in $G$. This allows us to write entries of $A_q$ as a sum of contributions to the deviation due to each vertex $s$ separately. For the case of $q = 1$, this argument is in fact exact and there are no low order terms. When $q > 1$, the contributions due to individual vertices contribute the bulk of the deviation and only low order terms remain.
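The per-vertex bookkeeping described above can be illustrated on a sampled graph. The sketch below is an illustration under stated assumptions, not the paper's construction: the vertex set `T` stands in for $I \cup J$, the graph is drawn from $G(n, \frac{1}{2})$ with a fixed seed, and the function names are hypothetical. It checks that the deviation of the number of common neighbours of `T` from its expectation is exactly a sum of centered contributions, one per vertex $s \notin T$.

```python
import random
from fractions import Fraction

def random_graph(n, seed=0):
    """Adjacency matrix of a sample from G(n, 1/2)."""
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            adj[u][v] = adj[v][u] = rng.random() < 0.5
    return adj

def per_vertex_contributions(adj, T):
    """For each s outside T: indicator that s is adjacent to all of T, minus its mean."""
    mean = Fraction(1, 2 ** len(T))
    return {
        s: (1 if all(adj[s][t] for t in T) else 0) - mean
        for s in range(len(adj)) if s not in T
    }

def common_neighbor_deviation(adj, T):
    """Number of common neighbours of T minus its expectation (n - |T|) / 2^{|T|}."""
    n = len(adj)
    count = sum(1 for s in range(n) if s not in T and all(adj[s][t] for t in T))
    return count - Fraction(n - len(T), 2 ** len(T))
```

Each contribution has mean zero over the draw of the graph and depends only on the edges from one vertex $s$ to `T`, which is what makes the later trace moment computations tractable.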

From here on, we are in a situation similar to the one encountered in analyzing $L$ in the previous subsection. We show that the components in the decomposition with large spectral norm do not contribute to quadratic forms over eigenspaces with small eigenvalues of the expectation matrix $E$ using the idea of patterned matrices from Section 6.3. We show that the remaining components have small spectral norm using the combinatorial techniques combined with the trace moment method developed in Section 6.4.

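We now proceed to make the ideas above more precise. We begin with some notation and a definition. We first define $e^1_{s,K} \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ for any $s \in [n]$ and $K \subseteq [n]$, $|K| = q$ as follows:

$$e^1_{s,K}(I,J) = \begin{cases} 0 & \text{if } I \cap J \neq K \text{ or } s \in I \cup J \\ 2^{|I \setminus J|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, I \setminus J) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

Similarly, we define $e^2_{s,K} \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ for any $s \in [n]$ and $K \subseteq [n]$, $|K| = q$:

$$e^2_{s,K}(I,J) = \begin{cases} 0 & \text{if } I \cap J \neq K \text{ or } s \in I \cup J \\ 2^{|J \setminus I|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, J \setminus I) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

Next, we define $e^3_{s,K} \in \mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ for any $s \in [n]$ and $K \subseteq [n]$ satisfying $|K| = q$:

$$e^3_{s,K}(I,J) = \begin{cases} 0 & \text{if } I \cap J \neq K \text{ or } s \in I \cup J \\ 2^{|K|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, K) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

Finally, we define

$$e^4_{K}(I,J) = \begin{cases} 0 & \text{if } I \cap J \neq K \\ 2^{|\mathcal{E}_{\mathrm{ext}}(I,J)|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(I,J) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

Using the matrices above, we can show the following approximate factorization for the entries of $A_{q,K}$.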


Lemma 7.7. For every $I, J$ such that $I \cap J = K$,


(*-„) 


1 + <W-/» £ (d + < K )( 1 + 4)(1 + < K ) - l) 

V2 d-q' se[n] 


< 2° {d) ■ 0{(o 2d ~l ■ ni- 1 ), 


fort] 


'i-2IIun+f |IU ? +1 f-f^') (n~\I\) 2d 111 1 
z (2d-|I|-l)! • 


Proof of Lemma 7.7. Let $A_{I \cup J}$ be the set of vertices $s$ in $G$ not in $I \cup J$ so that $(s, i) \in G$ for all $i \in I \cup J$. By definition, if $I \cup J$ is a clique,


Yj (! + eliV’ /)) ((! + e h c)(! + <ic)(! + <*) - l) = 2 2|:u/l (|A;| - 2-^(n - |J U /|)). 

se[i;] 
Applying the scaling $\eta$, we get


77(1 + e 1,2 (h /)) Y, ((! + 4)(1 + <k)(1 + 4) “ 1 ) = (l^l - 

se[n] 


n - |J|\ 2 ( W 2 + 1 )-( 2 2 )(n - |/|) 2 d-m - 1 


H-9 5 ) 


2 ih / (2d - |1| - 1)! 

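and hence the lemma follows from Theorem 9.4. □

We will need another definition before proceeding: for each $i$,

$$e^i_{s,q} \stackrel{\mathrm{def}}{=} \sum_{K : |K| = q} e^i_{s,K} .$$

As in the case of analyzing $L$, we define the filled in versions of $e^i_{s,K}$ as follows:

$$\tilde{e}^1_{s,K}(I,J) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } K \not\subseteq I \cap J \text{ or } s \in I \cup J \\ 2^{|I \setminus K|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, I \setminus K) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

$$\tilde{e}^2_{s,K}(I,J) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } K \not\subseteq I \cap J \text{ or } s \in I \cup J \\ 2^{|J \setminus K|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, J \setminus K) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$

$$\tilde{e}^3_{s,K}(I,J) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } K \not\subseteq I \cap J \text{ or } s \in I \cup J \\ 2^{|K|} - 1 & \text{otherwise and if } \mathcal{E}_{\mathrm{ext}}(\{s\}, K) \subseteq G \\ -1 & \text{otherwise.} \end{cases}$$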

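We start by giving norm bounds on all the matrices involved in the decomposition in Lemma 7.7.

Lemma 7.8.

1. For each $i \in \{1,2,3\}$,

$$\Big\| \sum_{s \in [n], K : |K| = q} e^i_{s,K} \Big\| \le O(2^{O(d)} \cdot n^{d-\frac{1}{2}}).$$

2. For each $i \in \{1,2,3\}$,

$$\Big\| \sum_{s \in [n], K : |K| = q} \left( \tilde{e}^i_{s,K} - e^i_{s,K} \right) \Big\| \le O(2^{O(d)} \cdot n^{d-3/2}).$$

3.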


E < 7 + 7 )> o E a+ 4 ) ° (i+ 4 ) ° a+ 4 > - E 

K:\K\=q se[»i] V 


3 1 


's,K 


i =1 


4 0 ( 2 ° (d) ■ n d - x ). 
Proof. Fix a $q$ and let $Q$ be any matrix in $\mathbb{R}^{\binom{[n]}{d} \times \binom{[n]}{d}}$ that appears in the statement of the lemma above. Let $R \in \mathbb{R}^{\binom{[n]}{d-q} \times \binom{[n]}{d-q}}$ be defined by

$$R(I,J) = \begin{cases} \sum_{s \in [n], K : |K| = q} Q(I \cup K, J \cup K) & \text{if } I \cap J = \emptyset \\ 0 & \text{otherwise.} \end{cases}$$

Then, $\sum_{s \in [n], K : |K| = q} Q = R^{(q)}$ in the sense of Definition 6.15. Thus, by Fact 6.16, $\|R^{(q)}\| \le 2^{2d} \|R\|$. Thus, we focus on bounding $\|R\|$ in the following. We will use the trace moment method for this purpose, and the argument is similar to the ones made in Section 6.4 to develop the general purpose spectral concentration results. For this reason, we will be a bit more terse than in the earlier applications of the trace moment method. We set up the notation for the general $R$ as above and specialize the combinatorial reasoning for each of the specific matrices involved later.

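We expand

$$\mathbb{E}[\mathrm{Tr}((RR^{\dagger})^{\ell})] = \mathbb{E}\Big[ \sum_{K_1, K_2, \ldots, K_{2\ell}} \; \sum_{I_1, I_2, \ldots, I_{2\ell}, \, s_1, s_2, \ldots, s_{2\ell}} R(I_1 \setminus K_1, I_2 \setminus K_2) R(I_3 \setminus K_3, I_2 \setminus K_2) \cdots R(I_{2\ell-1} \setminus K_{2\ell-1}, I_{2\ell} \setminus K_{2\ell}) \cdot R(I_{2\ell} \setminus K_{2\ell}, I_1 \setminus K_1) \Big]. \tag{7.4}$$

We now investigate when a term in the expansion above contributes a nonzero value to the LHS.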

First consider the case of $Q = \sum_{s,K} e^i_{s,K}$. Fix $i = 1$; the other cases are similar. $e^1_{s,K}(I, J)$ is a function of the variables $g_b$ for $b \in \mathcal{E}_{\mathrm{ext}}(\{s\}, I)$ (whenever $I \cap J = K$). Writing $e^1_{s,K}(I, J)$ as a polynomial in $g_b$ for $b \in \mathcal{E}_{\mathrm{ext}}(\{s\}, I)$, we observe that $\mathbb{E}_G[e^1_{s,K}(I,J)] = 0$ and that all coefficients of degree $j$ monomials are equal for every $j$ and at most $2^d$. We decompose the matrix $e^1_{s,K}$ so that for each $(I, J)$ we only pick one of the (corresponding) terms in the polynomial expansion of each entry of $e^1_{s,K}$ described above.

In the expansion of the expected trace above, then, for any such matrix that appears in the decomposition, each term is a (scaled) product of $g_b$ variables for some $b$. For the expectation of such a term to be nonzero, each $g_b$ must occur an even number of times. Consider the case of a matrix in the decomposition of $e^1_{s,K}$ with entries being some (corresponding) monomials of degree 1 for concreteness. The case of other matrices is similar. Fix any term. Let $T$ be the set of all vertices that appear in some $I_i$ for $i \le 2\ell$ and are part of some $b$ for $g_b$ that appears in the term. Then, by a random partitioning argument based on 2, we can first assume that $\{s_1, s_2, \ldots, s_{2\ell}\}$ doesn't intersect $T$ (and lose a logarithmic factor in the spectral norm upper bound). It is now immediate that every $s \in \{s_1, s_2, \ldots, s_{2\ell}\}$ must appear at least twice for a term to have nonzero expectation (otherwise, some $g_b$ will not appear twice in the product sequence describing the term). Thus, the number of distinct vertices in any term with a nonzero expectation is $\ell + (d-1) 2\ell = (d - \frac{1}{2})(2\ell)$. The number of possible terms with the same set of distinct vertices is at most $(2\ell)!$. Finally, each term contributes at most $2^d$. Thus, we can upper bound the expected trace of such a matrix by $2^{O(d)} \cdot (2\ell)! \cdot n^{(d-\frac{1}{2})(2\ell)} \, \mathrm{poly} \log(n)$. We now do the standard step of using Markov's inequality to obtain an upper estimate on $\mathrm{Tr}((QQ^{\dagger})^{\ell})$ that holds with probability $1 - 1/n$, take the $(2\ell)$th root and finally use $\ell = O(\log(n))$ to obtain the desired bound. Finally, each matrix in the decomposition based on the polynomial expansion in $g_b$ of the entries of $e^1_{s,K}$ can be similarly upper bounded in spectral norm, completing the analysis of this case by an application of the triangle inequality.
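The final step of the trace moment method rests on the deterministic inequality $\|M\| \le \mathrm{Tr}((MM^{\dagger})^{\ell})^{1/(2\ell)} \le n^{1/(2\ell)} \|M\|$, which is why taking $\ell = O(\log n)$ loses only a constant factor. The sketch below illustrates that inequality on small explicit matrices; it is a toy example with hypothetical function names and does not reproduce the combinatorial counting of the proof.

```python
def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace_moment_bound(M, ell):
    """Tr((M M^T)^ell)^(1/(2 ell)), an upper bound on the spectral norm of M."""
    gram = mat_mul(M, transpose(M))
    power = gram
    for _ in range(ell - 1):
        power = mat_mul(power, gram)
    trace = sum(power[i][i] for i in range(len(power)))
    return trace ** (1.0 / (2 * ell))
```

For the diagonal matrix with entries $3, 1, 1$ the spectral norm is 3, and `trace_moment_bound` with `ell = 10` returns a value only marginally above 3, well within the factor $3^{1/20}$ allowed by the inequality.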
The case of the matrix $\sum_{s \in [n], K : |K| = q} (\tilde{e}^i_{s,K} - e^i_{s,K})$ is similar, except that it is now a $(q+1)$th lift of some matrix. Repeating the reasoning above (except that each entry $(I, J)$ is now described by sets of size $d - 2$ instead of $d - 1$), we obtain the upper bound stated in the lemma.

Finally, we now proceed to the analysis for the third case. We first split the Hadamard product into a sum and analyze each term separately. We sketch the main difference from above for the combinatorial picture for the term $Q = \sum_{s,K} e^1_{s,K} \circ e^2_{s,K}$ here. The other arguments are similar. We again write $\mathbb{E}[\mathrm{Tr}((QQ^{\dagger})^{\ell})]$ as a sum over $K_1, \ldots, K_{2\ell}$, $s_1, s_2, \ldots, s_{2\ell}$ and $I_1, I_2, \ldots$ as above. By expanding out each of $e^1_{s,K}$ and $e^2_{s,K}$ as polynomials in appropriate $g_b$ variables, we observe that the least degree of any term in the expansion is at least 2 (i.e. it involves a product of at least 2 $g_b$s). Decomposing the matrix so that each entry gets the corresponding monomial in the polynomial expansion in terms of $g_b$s as above, we now consider the matrix with the term involving a scaled product of exactly 2 $g_b$s. The other matrices in the decomposition can be handled similarly. By a random partitioning argument (2) as before, we can assume that $\{s_1, s_2, \ldots, s_{2\ell}\}$ are disjoint from $T$, the subset of vertices from $I_i$ for $i \le 2\ell$ that appear in $b$ for some $g_b$, and lose an $O(\log(n))$ factor in the estimate on the norm. We again reason as before that for a nonzero expectation, the term must have each $g_b$ appear an even number of times. Since each $g_b$ must appear at least twice (whenever it appears at least once), the number of distinct $g_b$s that appear in a term that contributes nonzero expectation is at most $2\ell$. On the other hand, we now observe that the elements of $\{s_1, s_2, \ldots, s_{2\ell}\} \cup T$ are all connected via a path using the $b$ from the $g_b$ that appear in the term and thus, the number of distinct elements in $\{s_1, s_2, \ldots, s_{2\ell}\} \cup T$ is at most $2\ell + 1$. Thus, a nonzero contributing term has a total of at most $2\ell(d-2) + 2\ell = 2\ell(d-1)$ distinct vertices. Arguing in the standard way as done above, this now yields a norm estimate of $O(2^{O(d)} \cdot n^{d-1})$ as required.

□ 


Next, we find out which spaces these matrices contribute to. Towards this, for each $i$, we define:


def 


L * 

K:\K\=q 


Then, we have: 

Lemma 7.9. For any t, t' > q, we have: 


1 . 

n t el q = 0 , 

2 . 

el q U t = 0 , 

3. 

= o. 

Proof. We only do the proof for the first case; the others are similar. The idea is again to use Lemma 6.4. Let $v \in \mathbb{R}^{\binom{[n]}{d}}$. We will show that $\tilde{e}^2_{q} \cdot v = w$ such that there exists a vector $u$ such that for every $I$, $w_I = \sum_{I' \subseteq I, |I'| = q} u_{I'}$. An application of Lemma 6.4 then completes the proof of part (1).

We have:

$$\begin{aligned}
w_I &= \sum_{J \in \binom{[n]}{d}} \tilde{e}^2_{q}(I,J) v_J \\
&= \sum_{K : |K| = q} \; \sum_{J \in \binom{[n]}{d}} \tilde{e}^2_{s,K}(I,J) v_J \\
&= \sum_{K : K \subseteq I, |K| = q} \; \sum_{J \in \binom{[n]}{d}} \tilde{e}^2_{s,K}(I,J) v_J .
\end{aligned}$$

The proof now follows by observing that $\sum_J \tilde{e}^2_{s,K}(I, J) v_J$ depends only on $K$. Thus, we set $u_K$ for every $|K| = q$ by $u_K = \sum_{J \in \binom{[n]}{d}} \tilde{e}^2_{s,K}(I, J) v_J$ for any $I$ such that $K \subseteq I$. □


We can now complete the proof of Lemma 7.3 using Lemma 7.8 and Lemma 7.9.

Proof of Lemma 7.3. For each $q$, let $\hat{A}_q$ be the expression given by Lemma 7.7 to approximate the entries of $A_q$. Then, we have for any $i, j$:

$$\Pi_i A \Pi_j = \sum_{q=1}^{d} \Pi_i A_q \Pi_j = \sum_{q=1}^{d} \Pi_i (A_q - \hat{A}_q) \Pi_j + \sum_{q=1}^{d} \Pi_i \hat{A}_q \Pi_j .$$

By a simple application of Fact 3.3, it is easy to observe that $\|\hat{A}_q - A_q\| \le 2^{O(d)} O(\omega^{2d-q} \cdot n^{d-1})$. An application of Lemma 7.9 and Lemma 7.8 yields that the terms that contribute to $\Pi_i \hat{A}_q \Pi_j$ have norm at most $2^{O(d)} O(\omega^{2d - \min(i,j)} n^{d-\frac{1}{2}}) + 2^{O(d)} O(\omega^{2d-q} n^{d-1})$. This completes the proof. □


8 Analyzing Deviations for the Corrected Degree-4 Operator 

The goal of this section is to prove Lemma 5.6, that is, to show that $N' \succeq 0$. The proof is organized into 5 main claims that we next present.

We first show that it is enough to prove PSDness of a somewhat simplified matrix $N$. $N$ is produced by two simplifications to $N'$. First, to take care of the zero rows as in Section 7, we work with a matrix where we "fill in" the entries carefully. Second, $N$ and $M$ are equal on all entries $(I, J)$ such that $|I \cup J| \le 3$; in other words, the correction affects only the homogeneous degree 4 parts. Specifically, let $\mathcal{R} \in \mathbb{R}^{\binom{[n]}{2} \times \binom{[n]}{2}}$ be defined so that:


con A _ • Lseini n« e iu/r s (fl) if \I U J\ = 4 and S ext (I,J) c G 

J ) ~ 1 1 

10 otherwise. 

(Recall that $C_4$ is the number of 4-cliques in $G$.) We then set

$$N = M + \mathcal{R},$$

where $M$ is the filled in MPW matrix (Definition 4.5).

Our first claim shows that it is enough to prove PSDness of $N$:

Lemma 8.1. For any $\gamma, c$ there is $\omega_0 = \Omega(\sqrt{n / (\log(n) c \gamma)})$ so that for any $\omega \le \omega_0$, with probability $1 - O(n^{-10})$, if $N \succeq (\omega^2 n^2 / c) \cdot I$ then $N' \succeq 0$.

Next, we decompose the matrix $N$ appropriately and study the spectrum of each of the pieces. Towards this, we define $\mathcal{R}_0 \in \mathbb{R}^{\binom{[n]}{2} \times \binom{[n]}{2}}$ as follows. For every $I, J$:

W,J) = Y6 ' n «6(iu/)\(/n/)?'s(«) 

se[»] 

Recall that from Definition 7.1, we know that $M = E + L + A$. By writing $\mathcal{R} = \mathcal{R}_0 + (\mathcal{R} - \mathcal{R}_0)$, we obtain the decomposition:

$$N = (E + \mathcal{R}_0) + L + A + (\mathcal{R} - \mathcal{R}_0). \tag{8.1}$$

In what follows, we will analyze each piece of the decomposition above separately on a carefully constructed decomposition of $\mathbb{R}^{\binom{[n]}{2}}$. We now proceed and construct this decomposition. Recall $\mathbb{R}^{\binom{[n]}{2}} = V_0 \oplus V_1 \oplus V_2$ where $V_0, V_1, V_2$ are the eigenspaces of the matrix $E = \mathbb{E}[M] = \mathbb{E}[N]$ from Fact 4.9 and Lemma 4.10. Let $r_s \in \mathbb{R}^n$ be as described in Definition 5.3. With slight abuse of notation, we write $r_s^{\otimes 2}$ for the vector in $\mathbb{R}^{\binom{[n]}{2}}$ such that for every $I \in \binom{[n]}{2}$,

$$r_s^{\otimes 2}(I) = \prod_{i \in I} r_s(i).$$

We now define a new decomposition by splitting $V_2$ further and write $\mathbb{R}^{\binom{[n]}{2}} = W_0 \oplus W_1 \oplus W_{1.5} \oplus W_2$, where:

$$\begin{aligned}
W_0 &= V_0, \\
W_1 &= V_1, \\
W_{1.5} &\text{ s.t. } W_{1.5} \perp (W_0 \oplus W_1) \text{ and } W_0 \oplus W_1 \oplus W_{1.5} = V_0 \oplus V_1 \oplus \mathrm{Span}\{r_s^{\otimes 2}\}, \\
W_2 &= (W_0 \oplus W_1 \oplus W_{1.5})^{\perp}.
\end{aligned} \tag{8.2}$$

Let $\Pi_{W_a}$ be the projector to $W_a$ for every $a \in \{0, 1, 1.5, 2\}$.

We are now ready to analyze the spectrum of each piece from (8.1). First, we analyze the spectrum of $(E + \mathcal{R}_0)$:

Lemma 8.2. For every $\gamma$ there is $\omega_0 = \Omega(\sqrt{\gamma n} / \log(n)^2)$ so that for $\omega \le \omega_0$, with probability $1 - O(n^{-10})$,

$$(E + \mathcal{R}_0) \succeq \Omega(\omega^4 n^2) \cdot \Pi_{W_0} + \Omega(\omega^3 n^2) \cdot \Pi_{W_1} + \Omega(\gamma \omega^5 n) \cdot \Pi_{W_{1.5}} + \Omega(\omega^2 n^2) \cdot \Pi_{W_2} .$$

Next, we analyze the spectrum of $L$. Here at last we see the main technical improvement in these corrected moments: the cross term between $V_1$ and $V_2$ has become a cross term between $W_1$ and $W_2$ and has much-reduced norm.

Lemma 8.3. For any $\omega = O(\sqrt{n})$ there is $p = \mathrm{polylog}\, n$ so that, with probability $1 - O(n^{-10})$, the following bounds hold.

1. Diagonal Terms:

$$\|\Pi_{W_a} L \Pi_{W_a}\| \le O(p \omega^4 n^{3/2}) \text{ for } a \in \{0, 1, 1.5\}, \qquad \|\Pi_{W_2} L \Pi_{W_2}\| \le O(p \omega^4 n).$$

2. Off-Diagonal Terms:

$$\|\Pi_{W_a} L \Pi_{W_b}\| \le O(p \omega^4 n^{3/2}) \text{ for } a, b \in \{0, 1, 1.5, 2\}, \qquad \|\Pi_{W_a} L \Pi_{W_2}\| \le O(p \omega^3 n^{3/2}) \text{ for } a \in \{1, 1.5\}.$$

Next, we bound the spectral norm of $A$. This is a direct corollary of the more general bound in Lemma 7.3, but to have a self-contained proof of the degree-4 case we also give a proof later in this section.

Lemma 8.4. Let $M$ be the $\binom{n}{2} \times \binom{n}{2}$ filled-in matrix for the degree-4 MPW moments with clique size $\omega$ (Definition 4.5). Let $A$ be as in Definition 7.1. With probability $1 - O(n^{-10})$, $\|A\| \le O(\omega^3 n^{3/2} \log(n)^2)$.

Finally, we bound the spectral norm of the last piece $(\mathcal{R} - \mathcal{R}_0)$:

Lemma 8.5. Let $G \sim G(n, 1/2)$. With probability $1 - O(n^{-10})$, $\|\mathcal{R} - \mathcal{R}_0\| \le O(\gamma \omega^5 n^{1/2} \log(n)^2)$.

The proofs of these lemmas follow, but first we complete the proof of Lemma 5.6 and hence of 
Theorem 5.4. 

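Proof of Lemma 5.6. By Lemma 8.1, it will be enough to exhibit $c, \gamma = \mathrm{polylog}\, n$ and $\omega_1 = \Omega(\sqrt{n / (\log(n) c \gamma)})$ so that $N \succeq (\omega^2 n^2 / c) \cdot I$ with probability $1 - O(n^{-10})$ when $\omega \le \omega_1$. Then our final bound will be given by the minimum of $\omega_1$ and $\omega_0$ of Lemma 8.1. (Recall that $\gamma$ is a parameter inside $N$.) In the following, all that we claim happens with probability at least $1 - n^{-9}$ by a union bound.

So let $\omega_0 \in \mathbb{R}$; we will choose it later. We will find $c, \gamma$ and conditions on $\omega_0$ so that the conditions of Lemma 3.4 hold for $N - (\omega^2 n^2 / c) \cdot I$.

First of all, by Lemma 8.2, for every $\gamma$ there is $c = O(\min\{n / \omega^3 \gamma, 1\})$ so that

$$(E + \mathcal{R}_0) - (\omega^2 n^2 / c) \cdot I \succeq \Omega(\omega^4 n^2) \cdot \Pi_{W_0} + \Omega(\omega^3 n^2) \cdot \Pi_{W_1} + \Omega(\gamma \omega^5 n) \cdot \Pi_{W_{1.5}} + \Omega(\omega^2 n^2) \cdot \Pi_{W_2} .$$

We assume $c = c(\gamma)$ is chosen in this way.

For any $\gamma$, by Lemma 8.5, if we choose $\omega \le \sqrt{n} / (\gamma \log(n)^2)$ then $\|\mathcal{R} - \mathcal{R}_0\| \le o(\omega^2 n^2)$. Adding $\mathcal{R} - \mathcal{R}_0$ to the previous equation,

$$(E + \mathcal{R}_0) + (\mathcal{R} - \mathcal{R}_0) - (\omega^2 n^2 / c) \cdot I \succeq \Omega(\omega^4 n^2) \cdot \Pi_{W_0} + \Omega(\omega^3 n^2) \cdot \Pi_{W_1} + \Omega(\gamma \omega^5 n) \cdot \Pi_{W_{1.5}} + \Omega(\omega^2 n^2) \cdot \Pi_{W_2} .$$

By the same reasoning using Lemma 8.4 to add $A$ to the previous equation, we have for the same choice of $\omega$:

$$N - L - (\omega^2 n^2 / c) \cdot I \succeq \Omega(\omega^4 n^2) \cdot \Pi_{W_0} + \Omega(\omega^3 n^2) \cdot \Pi_{W_1} + \Omega(\gamma \omega^5 n) \cdot \Pi_{W_{1.5}} + \Omega(\omega^2 n^2) \cdot \Pi_{W_2} . \tag{8.3}$$

So it just remains to add $L$ to the left-hand side.

We decompose $L$ as:

$$L = (\Pi_{W_0} + \Pi_{W_1} + \Pi_{W_{1.5}}) L (\Pi_{W_0} + \Pi_{W_1} + \Pi_{W_{1.5}}) + \Pi_{W_2} L + L \Pi_{W_2} .$$

Let $p$ be as in Lemma 8.3, which implies that

$$(\Pi_{W_0} + \Pi_{W_1} + \Pi_{W_{1.5}}) L (\Pi_{W_0} + \Pi_{W_1} + \Pi_{W_{1.5}}) \preceq O(p \omega^4 n^{3/2}) (\Pi_{W_0} + \Pi_{W_1} + \Pi_{W_{1.5}}). \tag{8.4}$$

Choosing $\gamma = O(p^2 \log(n))$ we get that this is $o(\sqrt{\omega^3 n^2 \cdot \gamma \omega^5 n})$. So using Lemma 3.5 to add (8.3) and (8.4), we obtain

$$N - \Pi_{W_2} L - L \Pi_{W_2} - (\omega^2 n^2 / c) \cdot I \succeq \Omega(\omega^4 n^2) \cdot \Pi_{W_0} + \Omega(\omega^3 n^2) \cdot \Pi_{W_1} + \Omega(\gamma \omega^5 n) \cdot \Pi_{W_{1.5}} + \Omega(\omega^2 n^2) \cdot \Pi_{W_2} . \tag{8.5}$$

We break $\Pi_{W_2} L$ apart as

$$\Pi_{W_2} L = \Pi_{W_2} L \Pi_{W_0} + \Pi_{W_2} L \Pi_{W_2} + \Pi_{W_2} L (\Pi_{W_1} + \Pi_{W_{1.5}}).$$

By Lemma 8.3,

$$\begin{aligned}
\|\Pi_{W_2} L \Pi_{W_0}\| &\le O(p \omega^4 n^{3/2}) = o(\sqrt{\omega^4 n^2 \cdot \omega^2 n^2}) && \text{for } \omega \le \sqrt{n} / (p \log(n)), \\
\|\Pi_{W_2} L \Pi_{W_2}\| &\le O(p \omega^4 n) = o(\omega^2 n^2) && \text{for } \omega \le \sqrt{n} / (p \log(n)), \\
\|\Pi_{W_2} L (\Pi_{W_1} + \Pi_{W_{1.5}})\| &\le O(p \omega^3 n^{3/2}) = o(\sqrt{\omega^2 n^2 \cdot \gamma \omega^5 n}) && \text{for } \gamma \ge p^2 \log p.
\end{aligned}$$

Together with Lemma 3.5 and (8.5), this implies the lemma, for $\gamma \ge p^2 \log p$, $c = c(\gamma)$ as above, and $\omega_0 \le \min\{\sqrt{n} / (p \log(n)^2), \sqrt{n} / (\gamma \log(n)^2)\}$. □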

8.1 Proof of Diagonal and Off-Diagonal Norm Bounds (Lemma 8.3) 

Here we prove Lemma 8.3. 

Proof of Lemma 8.3. We start with the easy parts. Note that $\Pi_{W_2} L \Pi_{W_2} = \Pi_{W_2} \Pi_{V_2} L \Pi_{V_2} \Pi_{W_2}$, so the bound

$$\|\Pi_{W_2} L \Pi_{W_2}\| \le O(\omega^4 n)$$

is immediate from Lemma 7.2. The same lemma also implies that $\|L\| \le O(\omega^4 n^{3/2})$, and since projectors are contractive this finishes the bounds on the diagonal terms and the first part of the off-diagonal bound (for cross terms among $W_0, W_1, W_{1.5}$).

We just have to prove that $\|\Pi_{W_a} L \Pi_{W_2}\| \le O(\omega^4 n)$ for $a \in \{1, 1.5\}$. We will use the patterned matrix machinery to show this. We recall the decomposition of $L$ from Section 7 as $L = L_0 + L_1 + L_2$. The main point is to show that $\|\Pi_{W_a} L_0 \Pi_{W_2}\| \le O(\omega^4 n)$, so we postpone to the end of the proof the case of $L_1$ and $L_2$.

Recall again that $L_0$ can be further decomposed as $L_0 = O(1) \cdot \sum_{B \in \mathcal{B}} L_0^B$ (again, see Section 7 for definitions). Finally, for each $L_0^B$ there is a $\tilde{L}_0^B$ so that $\|L_0^B - \tilde{L}_0^B\| \le O(\omega^4 n)$ with probability $1 - O(n^{-10})$ (see Claim 7.4). Since $|\mathcal{B}| = O(1)$, it is actually enough for us to show that

$$\Big\| \Pi_{W_a} \sum_{B \in \mathcal{B}} \tilde{L}_0^B \, \Pi_{W_2} \Big\| \le O(\omega^4 n).$$

Let $\mathcal{B}_1$ be the bipartite graphs on $[2] \times [2]$ for which at least one right-hand vertex has degree 0. Then by Lemma 6.9, $\sum_{B \in \mathcal{B}_1} \tilde{L}_0^B \Pi_{W_2} = 0$, since $W_2 \subseteq V_2$.

We are left with $\mathcal{B}_2$, the family of bipartite graphs where every right-hand-side vertex has nonzero degree. Using Lemma 6.11 on the matrices $L_0^B$, we get that if $B$ has no vertices of degree 0 then $\|L_0^B\| \le O(\omega^4 n)$ with probability $1 - O(n^{-10})$. Since as above $\|L_0^B - \tilde{L}_0^B\| \le O(\omega^4 n)$ with similar probability, we get $\|\tilde{L}_0^B\| \le O(\omega^4 n)$ for these $B$.

The only remaining graphs in $\mathcal{B}$ are the two graphs $B_1, B_2$ with exactly one vertex on the left-hand side of degree 0. It is not hard to check that the rows $\{a, b\}$ of the matrices $\tilde{L}_0^{B_1}$ and $\tilde{L}_0^{B_2}$ are $r_a^{\otimes 2}$ and $r_b^{\otimes 2}$, respectively. Since $W_2 \perp r_s^{\otimes 2}$, we get $\tilde{L}_0^{B_1} \Pi_{W_2} = \tilde{L}_0^{B_2} \Pi_{W_2} = 0$.

It remains to handle $L_1$ and $L_2$. By Claim 7.5 together with Claim 7.4, each satisfies $\|L_1\|, \|L_2\| \le O(\omega^3 n^{3/2})$. Since $\omega \le \sqrt{n}$, the lemma now follows. □


8.2 Lower-Degree Cleanup (Lemma 8.1) 

In this section we prove Lemma 8.1. We start by bounding the difference between our pseudoexpectation $\tilde{\mathbb{E}}$ and the MPW operator on polynomials of degree less than 4. This lemma is a consequence of Lemma 5.7 and the Gershgorin circle theorem (Fact 3.3).

Lemma 8.6. Let $G \sim G(n, 1/2)$. Let $\omega$ be a real parameter. Let $\tilde{\mathbb{E}}$ be as given in Definition 5.3. Let $\tilde{\mathbb{E}}_0$ be the MPW operator for clique size $\omega$. Suppose $\omega \le \sqrt{n}$ and $\gamma = O(\omega)$. Then with probability at least $1 - O(n^{-20})$,

• Every $i, j, k \in [n]$ with $i, j, k$ all distinct satisfies

$$|\tilde{\mathbb{E}} x_i x_j x_k - \tilde{\mathbb{E}}_0 x_i x_j x_k| \le O(\gamma \log(n) \omega^4 / n^4).$$

• Every $i, j \in [n]$ with $i \neq j$ satisfies $|\tilde{\mathbb{E}} x_i x_j - \tilde{\mathbb{E}}_0 x_i x_j| \le o(\omega^2 / n^2)$.

Proof of Lemma 8.6. Recall that we obtained $\tilde{\mathbb{E}}$ by starting with the clique-size $\omega$ MPW operator on multilinear homogeneous degree-4 polynomials, adding a correction operator on those same polynomials, and then inferring values of $\tilde{\mathbb{E}}$ on lower-degree polynomials via the constraint $\sum_i x_i = \omega'$. There are two primary sources of the difference between our operator $\tilde{\mathbb{E}}$ and the MPW operator on polynomials of lower degree. The dominant one is the propagation to lower degree polynomials of the correction operator $\mathcal{L}$ (recall Definition 5.1). The second is that the degree-4 values coming from the MPW part of our operator $\tilde{\mathbb{E}}$ are propagated downwards using the constraint $\sum_i x_i = \omega'$ rather than $\sum_i x_i = \omega$ as is done to define the rest of the MPW operator.
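The downward propagation via $\sum_i x_i = \omega$ can be sanity-checked on a genuine distribution. The sketch below is an illustration under stated assumptions, not the paper's operator: it uses the uniform distribution over $\omega$-subsets of $[n]$ as a stand-in for a pseudoexpectation, since it satisfies the same constraint exactly. For a set $S$, multiplying $x_S$ by $\sum_\ell x_\ell = \omega$ and using $x_\ell^2 = x_\ell$ gives $\mathbb{E}\, x_S = \frac{1}{\omega - |S|} \sum_{\ell \notin S} \mathbb{E}\, x_S x_\ell$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def uniform_moment(n, omega, size):
    """E[prod_{i in S} x_i] with |S| = size, for x the indicator vector of a
    uniformly random omega-subset of [n]."""
    return Fraction(comb(n - size, omega - size), comb(n, omega))

def propagate_down(n, omega, size):
    """Degree-`size` moment inferred from the degree-(size + 1) moments through
    the constraint sum_i x_i = omega."""
    higher = uniform_moment(n, omega, size + 1)
    # There are n - size choices of the extra index, each with the same moment.
    return Fraction(n - size, omega - size) * higher
```

With `size = 3` the prefactor is $1/(\omega - 3)$, matching the one in the degree 3 computation; replacing $\omega$ by $\omega'$ in that prefactor is the second source of error discussed above.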

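We start with the degree 3 bound. Denote by $\tilde{\mathbb{E}}_0$ the MPW operator for clique size $\omega$. Consider $i, j, k \in [n]$ all distinct, and recall that

$$\begin{aligned}
\tilde{\mathbb{E}} x_i x_j x_k &= \frac{1}{\omega' - 3} \sum_{\ell \neq i,j,k} \tilde{\mathbb{E}} x_i x_j x_k x_\ell \\
&= \frac{1}{\omega' - 3} \sum_{\ell \neq i,j,k} \left( \tilde{\mathbb{E}}_0 x_i x_j x_k x_\ell + \mathcal{L} x_i x_j x_k x_\ell \right) \\
&= \left( \frac{1}{\omega' - 3} - \frac{1}{\omega - 3} \right) \sum_{\ell \neq i,j,k} \tilde{\mathbb{E}}_0 x_i x_j x_k x_\ell + \tilde{\mathbb{E}}_0 x_i x_j x_k + \frac{1}{\omega' - 3} \sum_{\ell \neq i,j,k} \mathcal{L} x_i x_j x_k x_\ell .
\end{aligned}$$

Thus,

$$\tilde{\mathbb{E}} x_i x_j x_k - \tilde{\mathbb{E}}_0 x_i x_j x_k = \left( \frac{1}{\omega' - 3} - \frac{1}{\omega - 3} \right) \sum_{\ell \neq i,j,k} \tilde{\mathbb{E}}_0 x_i x_j x_k x_\ell + \frac{1}{\omega' - 3} \sum_{\ell \neq i,j,k} \mathcal{L} x_i x_j x_k x_\ell .$$

By Lemma 5.7, with probability $1 - O(n^{-25})$ every $i, j, k$ satisfies $|\sum_{\ell \neq i,j,k} \mathcal{L} x_i x_j x_k x_\ell| \le O(\gamma \omega^5 \log(n) / n^4)$. At the same time, we know by Lemma 5.5 together with Lemma 5.7 that $|\omega' - \omega| \le O(\gamma \omega^2 \log(n)^2 / n^{5/2})$. This implies that $|1/(\omega' - 3) - 1/(\omega - 3)| \le O(\gamma \log(n)^2 / n^{5/2})$ (all with probability at least $1 - O(n^{-25})$). In conjunction with the preceding, it implies also that every $i, j, k$ satisfies $(1/(\omega' - 3)) |\sum_{\ell \neq i,j,k} \mathcal{L} x_i x_j x_k x_\ell| \le O(\gamma \log(n) \omega^4 / n^4)$. Together with the trivial bound $|\sum_{\ell \neq i,j,k} \tilde{\mathbb{E}}_0 x_i x_j x_k x_\ell| \le O(\omega^4 / n^3)$ with probability $1 - n^{-\omega(1)}$ (following from [MPW15, Theorem 10.3]), all this implies that with probability $1 - O(n^{-25})$,

$$|\tilde{\mathbb{E}} x_i x_j x_k - \tilde{\mathbb{E}}_0 x_i x_j x_k| \le O(\gamma \log(n)^2 \omega^4 / n^{11/2}) + O(\gamma \log(n) \omega^4 / n^4) = O(\gamma \log(n) \omega^4 / n^4).$$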

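We turn to the degree-two bound. Fix $i \neq j \in [n]$. We expand $\tilde{\mathbb{E}} x_i x_j - \tilde{\mathbb{E}}_0 x_i x_j$:

$$\begin{aligned}
\tilde{\mathbb{E}} x_i x_j - \tilde{\mathbb{E}}_0 x_i x_j &= \frac{1}{(\omega' - 2)(\omega' - 3)} \sum_{k \neq i,j} \tilde{\mathbb{E}} x_i x_j x_k - \frac{1}{(\omega - 2)(\omega - 3)} \sum_{k \neq i,j} \tilde{\mathbb{E}}_0 x_i x_j x_k \\
&= \frac{1}{(\omega' - 2)(\omega' - 3)} \sum_{k \neq i,j} \left( \tilde{\mathbb{E}} x_i x_j x_k - \tilde{\mathbb{E}}_0 x_i x_j x_k \right) - \left( \frac{1}{(\omega - 2)(\omega - 3)} - \frac{1}{(\omega' - 2)(\omega' - 3)} \right) \sum_{k \neq i,j} \tilde{\mathbb{E}}_0 x_i x_j x_k .
\end{aligned}$$

With probability $1 - O(n^{-20})$ when $G \sim G(n, 1/2)$, by Lemma 5.7, we get that $1/((\omega' - 2)(\omega' - 3)) = O(1/\omega^2)$. By the same,

$$\left| \frac{1}{(\omega - 2)(\omega - 3)} - \frac{1}{(\omega' - 2)(\omega' - 3)} \right| \le O(\gamma / \omega n^{5/2}).$$

Together with the bound from earlier in this proof on $|\tilde{\mathbb{E}} x_i x_j x_k - \tilde{\mathbb{E}}_0 x_i x_j x_k|$ and the trivial bound $|\sum_{k \neq i,j} \tilde{\mathbb{E}}_0 x_i x_j x_k| \le O(\omega^3 / n^2)$, we obtain

$$|\tilde{\mathbb{E}} x_i x_j - \tilde{\mathbb{E}}_0 x_i x_j| \le O(\gamma \omega^2 / n^3) + O(\gamma \omega^2 / n^{9/2})$$

which is $o(\omega^2 / n^2)$ for $\omega = o(\sqrt{n})$ and $\gamma = o(\omega)$. □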
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Proof of Lemma 8.1. We first observe that an eigenvalue lower bound on N implies the same on the 
(principal) submatrix indexed only by cliques (in this case, edges) in G. This submatrix is equal to 
C 4 N' + Err, where 


Err (7,J) = 


f C 4 E x ! xl - C 4 Eo jeV 

1 ° 


if \I U /I < 4 
otherwise 


We break Err into two parts so that Err 2 + Err 3 = Err: 


Err 2 (/,/) = 


Err (/,/) 
0 


if |J U /I = 2 
otherwise 


and 


Err 3 (I,J) = 


Err(fJ) 

0 


if \I U J\ = 3 
otherwise 


Note that Err 3 consists of the off-diagonal nonzero entries of Err, while Err 2 contains the diagonal 
entries of Err. We start by showing a bound on ||Err 3 ||. 


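\[ \begin{aligned}
\|\mathrm{Err}_3\| &\le \max_{I \in \binom{[n]}{2}} \sum_{J \neq I} |\mathrm{Err}_3(I,J)| \\
&= \max_{I = \{i_1, i_2\}} \sum_{k \neq i_1, i_2} |\mathrm{Err}_3(I, \{k, i_1\})| + |\mathrm{Err}_3(I, \{k, i_2\})| && \text{(definition of } \mathrm{Err}_3\text{)} \\
&\le O(n) \cdot \max_{i,j,k \text{ all not equal}} C_4\, |E\, x_i x_j x_k - E_0\, x_i x_j x_k| \\
&\le O(n) \cdot C_4 \cdot O(\gamma \log(n) \omega^4 / n^4) && \text{w.p. } 1 - O(n^{-20}) \text{ by Lemma 8.6} \\
&\le O(\gamma \log(n) \omega^4 n) && \text{w.p. } 1 - O(n^{-20}) \text{ by } C_4 \approx n^4 \text{, see [MPW15, Theorem 10.3].}
\end{aligned} \]

Next we bound $\mathrm{Err}_2$. Since it is diagonal, it is enough to give an entrywise bound.

\[ \begin{aligned}
\|\mathrm{Err}_2\| &\le \max_{I \in \binom{[n]}{2}} |\mathrm{Err}_2(I,I)| \\
&= \max_{i \neq j} C_4\, |E\, x_i x_j - E_0\, x_i x_j| && \text{by definition of } \mathrm{Err}_2 \\
&\le C_4 \cdot o(\omega^2 / n^2) && \text{w.p. } 1 - O(n^{-20}) \text{ by Lemma 8.6} \\
&\le o(\omega^2 n^2) && \text{w.p. } 1 - O(n^{-20}) \text{ by } C_4 \approx n^4 \text{, see [MPW15, Theorem 10.3].}
\end{aligned} \]

Fix $\gamma, c \in \mathbb{R}$. Suppose $N \succeq (\omega^2 n^2 / c) \cdot I$. Then for $N + \mathrm{Err}$ to be PSD it is enough to have $\|\mathrm{Err}\| \le \omega^2 n^2 / c$. There is by the above bounds a universal constant $C$ so that it is enough to have $C c \gamma \log(n) \omega^4 n \le \omega^2 n^2$, or rearranging, $\omega \le \sqrt{n / (C c \log(n) \gamma)}$. □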
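The bound on the off-diagonal error term rests on the standard fact that the spectral norm of a symmetric matrix is at most its largest absolute row sum. The following is a minimal numerical sketch of that fact only; the helper names and the random test matrix are illustrative assumptions and are not part of the paper's construction.

```python
import math
import random


def mat_vec(a, v):
    """Multiply the square matrix a (list of rows) by the vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def spectral_norm_estimate(a, iters=200, seed=0):
    """Power iteration on a*a for a symmetric matrix a.

    The returned value is |a v| for a unit vector v, so it never
    exceeds the true spectral norm.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in a]
    for _ in range(iters):
        w = mat_vec(a, mat_vec(a, v))
        size = norm(w)
        if size == 0.0:
            return 0.0
        v = [x / size for x in w]
    return norm(mat_vec(a, v))


def max_abs_row_sum(a):
    """Largest sum of absolute values across any row."""
    return max(sum(abs(x) for x in row) for row in a)


def random_symmetric(n, seed=0):
    """A hypothetical test matrix with entries uniform in [-1, 1]."""
    rng = random.Random(seed)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            a[i][j] = a[j][i] = rng.uniform(-1.0, 1.0)
    return a


if __name__ == "__main__":
    a = random_symmetric(8, seed=1)
    print(spectral_norm_estimate(a), "<=", max_abs_row_sum(a))
```

In the proof, each row of the off-diagonal part has $O(n)$ nonzero entries, which is where the factor $O(n)$ in the row-sum bound comes from.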


8.3 Eigenvalue Lower Bound for the Correction 


The following is the main claim for this section. 
Lemma 8.7. Let $G \sim G(n, 1/2)$. Let

\[ r_s(i) = \begin{cases} 1 & \text{if } s \sim i \\ -1 & \text{if } s \not\sim i \\ 0 & \text{if } s = i. \end{cases} \]

Let $r_s^{\otimes 2} \in \mathbb{R}^{\binom{[n]}{2}}$ be the vector with entries $r_s^{\otimes 2}(\{i,j\}) = r_s(i) r_s(j)$. Let $V = \mathrm{Span}\{r_s^{\otimes 2}\}_{s \in [n]}$. Let $\Pi_V$ be the projector to $V$. With probability at least $1 - O(n^{-10})$ over the sample of $G$,

\[ \sum_s (r_s^{\otimes 2})(r_s^{\otimes 2})^\top \succeq \Omega(n^2)\, \Pi_V. \]

S 

We will need the following graph theoretic machinery for the moment method bound. 

Definition 8.8. Let $G$ be a graph on $[n]$. A diamond ribbon $R$ of length $2\ell$ is a graph on vertices $s_1, \ldots, s_\ell, t_1, \ldots, t_\ell, u_1, \ldots, u_{2\ell}, v_1, \ldots, v_{2\ell}$. It has edges

\[ (s_i, u_{2i}),\ (s_i, v_{2i}),\ (u_{2i}, t_i),\ (v_{2i}, t_i),\ (t_i, u_{2i+1}),\ (t_i, v_{2i+1}),\ (u_{2i+1}, s_{i+1}),\ (v_{2i+1}, s_{i+1}) \]

where addition is modulo $2\ell$.

A labeled diamond ribbon $(R,F)$ is a diamond ribbon of length $2\ell$ together with a labeling $F : R \to G$ of vertices in $R$ with vertices from $G$. We insist that $F(x) \neq F(y)$ whenever $(x, y) \in R$ is an edge.

The labeled diamond ribbon $(R,F)$ is contributing if no element of the multiset $\{(F(x), F(y)) \text{ such that } (x, y) \in R\}$ occurs with odd multiplicity. It is disjoint if the sets

\[ \{F(s_i)\} \cup \{F(t_i)\},\quad \{F(u_i)\}_{i \text{ odd}},\quad \{F(u_i)\}_{i \text{ even}},\quad \{F(v_i)\}_{i \text{ odd}},\quad \{F(v_i)\}_{i \text{ even}} \]

are disjoint.

Lemma 8.9. Let $(R,F)$ be a contributing disjoint labeled diamond ribbon of length $2\ell$. Then it contains at most $3\ell + O(1)$ distinct labels.

Proof. By our disjointness assumption, every element of the multiset $\{F(s_i), F(t_i)\}$ must occur with multiplicity at least two and similarly for $\{F(u_i)\}$ and $\{F(v_i)\}$. □

Proof of Lemma 8.7. Note that the matrix $R = \sum_s (r_s^{\otimes 2})(r_s^{\otimes 2})^\top$ has row and column spaces both $V$. Note also that it factors as $SS^\top$, where $S$ is the $\binom{n}{2} \times n$ matrix whose columns are the vectors $r_s^{\otimes 2}$. Thus, it will be enough to show that

\[ S^\top S \succeq \Omega(n^2)\, I \quad \text{w.p. } 1 - O(n^{-10}) \]

where here $I$ is the $n \times n$ identity matrix.

For this, consider the matrix $S^\top S$ indexed by vertices $s, t \in [n]$. It has entries

\[ S^\top S(s, t) = \langle r_s^{\otimes 2}, r_t^{\otimes 2} \rangle \]

and in particular,

\[ S^\top S(s,s) = \langle r_s^{\otimes 2}, r_s^{\otimes 2} \rangle = \binom{n-1}{2}. \]

So, zeroing this matrix on the diagonal, it is enough to prove that

\[ \left\| S^\top S - \binom{n-1}{2} I \right\| \le o(n^2) \quad \text{w.p. } 1 - O(n^{-10}). \]

Let $H := S^\top S - \binom{n-1}{2} I$. Let $H_{i,j}$ for $i \neq j$ be given by

\[ H_{i,j}(s, t) = \begin{cases} r_s(i) r_s(j) r_t(i) r_t(j) & \text{if } s \neq t \\ 0 & \text{otherwise.} \end{cases} \]

Then $H = \sum_{i \neq j} H_{i,j}$. Note that $H_{i,j}(s, t) = 0$ if $i \in \{s, t\}$ or $j \in \{s, t\}$. Thus the obvious generalization of Lemma 6.14 to the two-parameter family $H_{i,j}$ applies. This gives us a family of matrices $H^1, \ldots, H^r$
for some r = 0(log n) and a corresponding family of partitions (S|, S\, S^, S S^, S^),..., (S r v ..., S'f) 
of [»]. 

These will be such that Yi =i b/' = b/, and 

H\s,t)= Yj r 5 (i)rs(k)r t (j)r t (k) 

( msi' 

where cS'X Sg is the subset giving indices so that the corresponding summand has not occurred 
in any $i' < i$. We will bound $E\, \mathrm{Tr}(H^i)^{2\ell}$ for some $\ell$ to be chosen later. Every term in the expansion of this quantity corresponds to a disjoint labeled diamond ribbon of length $2\ell$, and the number of nonzero terms is at most the number of contributing disjoint labeled diamond ribbons. So $E\, \mathrm{Tr}(H^i)^{2\ell} \le n^{3\ell + O(1)}$. The rest follows by standard manipulations. □
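To make the objects in this proof concrete, the sketch below builds the vectors $r_s^{\otimes 2}$ for a small sampled graph and forms the Gram matrix $S^\top S$. It is an illustration under stated assumptions: the sampler, the function names, and the choice $n = 7$ are hypothetical, and the off-diagonal identity checked in the usage note is an elementary consequence of the definitions rather than a step taken from the paper.

```python
import itertools
import math
import random


def sample_graph(n, seed=0):
    """Adjacency matrix of a graph where each edge appears with probability 1/2."""
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    for u, v in itertools.combinations(range(n), 2):
        adj[u][v] = adj[v][u] = rng.random() < 0.5
    return adj


def r(adj, s, i):
    """r_s(i): +1 if s and i are adjacent, -1 if not, 0 if s == i."""
    if s == i:
        return 0
    return 1 if adj[s][i] else -1


def tensor_square(adj, s):
    """The vector with entries r_s(i) * r_s(j), indexed by unordered pairs {i, j}."""
    n = len(adj)
    return [r(adj, s, i) * r(adj, s, j)
            for i, j in itertools.combinations(range(n), 2)]


def gram(adj):
    """The n x n matrix of inner products between the tensor-square vectors."""
    n = len(adj)
    cols = [tensor_square(adj, s) for s in range(n)]
    return [[sum(a * b for a, b in zip(cols[s], cols[t])) for t in range(n)]
            for s in range(n)]


if __name__ == "__main__":
    n = 7
    g = gram(sample_graph(n, seed=3))
    print([g[s][s] for s in range(n)], math.comb(n - 1, 2))
```

Every diagonal entry equals $\binom{n-1}{2}$, because exactly the pairs avoiding $s$ contribute 1. For $s \neq t$, writing $a_i = r_s(i) r_t(i)$, the entry is $((\sum_i a_i)^2 - (n-2))/2$, which is the quantity the trace moment method has to control.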


8.4 Proofs of Remaining Lemmas 

Proof of Lemma 8.2. By Lemma 4.10, 

E > Q(<n 4 n 2 ) • TIw 0 + Q (co 3 n 2 ) ■ Tly^ + Q (co 2 n 2 ) • IIw 2 • 

Let W = Wo ® Wi. By Lemma 8.7 and [MPW15, Theorem 10.3] (saying that C 4 ~ n 4 ), with 
probability 1 - 0(n~ ll) ) we get that %) > Q(ycu 5 n)n Span j r ® 2 |. We make the observation that %) = 
(ITvv + + n Wl5 ) and that nw^ngp^j^jLIwjg = Llw 15 . So we just need to handle the 

term nyv^onwjs + Hvy ^onw- 

Together, Lemma 6.12 and Lemma 5.7 imply that H^oll ^ 0(yafn log(/;) 2 ) with probability 
1 - 0(n~ w ). Thus to ensure that ||^oll ^ °(^/co 3 n 2 ■ yco 5 n) it is enough to choose cu ^ yfyEI log(n) 2 . 
This is enough to apply Lemma 3.5 and conclude the proof. □ 

Proof of Lemma 8.4. Note that for $I, J$ disjoint we have $A(I, J) = 0$. We bound the maximal sum across any row of $A$. With probability $1 - O(n^{-10})$ every off-diagonal entry of $A$ is at most $O(\omega^3 \sqrt{n} \log(n)^2)$ in absolute value [MPW15, Theorem 10.1]. For each $I \in \binom{[n]}{2}$, we then get $\sum_{J \neq I} |A(I,J)| \le O(\omega^3 n^{3/2} \log(n)^2)$. At the same time, the diagonal entries are each at most $O(\omega^2 n^{3/2} \log(n)^2)$ with similar probability, again by [MPW15, Theorem 10.1]. □

Proof of Lemma 8.5. % and Kq differ in two respects. We first bound the spectral norm of the part of 
on non-disjoint entries. Let 


K ( o\lJ) = 


Mil) 

0 


if \1 n /I = 3 
otherwise 
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It follows from Lemma 5.7 and a row-sum bound that ||!R®|| ^ O(yo > 5 \fn log(n) 2 ) with probability 
1 - 0(n~ l °). A similar analysis holds for the analogous matrix < R^■ 

Let *Rq be given by 

<Rq(J J) = ^ se[,!] r ^ 5C4 ' n « e (' u /)\('n/)Ls(«) if \I Pi J\ = 4 

lo otherwise 

Now it is enough to bound Wo - 7?||. This is the deviation introduced by zeroing non-clique entries. 
Note that each entry I, J of 'R can be decomposed as a function of the underlying graph edge 
variables: 

nm = n+yc 4 (f t ) 5 ^ £ £ n ^ n rs{u) ' 

nonempty SQ& ex t{l,J) s ( u,v)eS ueluj 

where we recall that for an edge (u , v), the variable g( u ,v) is the +1 indicator for that edge. Each 
of the entries in the sum over nonempty S above corresponds to a matrix of the form bounded 
in Lemma 6.12, where we conclude that each has spectral norm at most 0(n 3 ^ 2 log(n) 2 ) with 
probability 1 - 0(n~ l °). We conclude (also using C 4 « n 4 , see [MPW15, Theorem 10.3]) that 
Wo ~ ^11 ^ 0(yco 5 y/n log(n) 2 ) as desired. □ 


9 Concentration of deg_G(I)


In this section, we prove the following large-deviation bounds on the number of cliques of a given size that a random $G(n, 1/2)$ graph contains and on $\deg_G(I)$. Similar results (which are likely sufficient for our needs) appear in the literature; see [Ruc88, Vu01, JLR11] for instance. We provide these proofs for completeness. A coarser concentration result for $\deg_G(I)$ appears in [MPW15].

Definition 9.1. For a graph $G$, define $N_x(G)$ to be the number of $x$-cliques in $G$.

Unless otherwise specified, in this section $G \sim G(n, 1/2)$.

This first theorem gives the large deviation bound for the number of cliques of size $x$ in $G$.

Theorem 9.2. For all $\varepsilon \in (0, 1)$, for all $x$, if $n \ge x^2 (2e - e \ln \varepsilon)(2e + 2 - e \ln \varepsilon)$ then

\[ P\left[ \left| N_x(G) - 2^{-\binom{x}{2}} \binom{n}{x} \right| > e(2 - \ln \varepsilon) \frac{x^2}{x!} n^{x-1} \right] < \varepsilon. \]

(Note that $2^{-\binom{x}{2}} \binom{n}{x} = E\, N_x(G)$.)
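The identity for the expected clique count can be checked exhaustively on very small graphs. The sketch below is only a sanity check under the assumption that every graph on $n$ labeled vertices is equally likely, which is exactly $G(n, 1/2)$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def count_cliques(n, edges, x):
    """Number of x-subsets of range(n) whose pairs all lie in the edge set."""
    return sum(
        all(pair in edges for pair in combinations(c, 2))
        for c in combinations(range(n), x)
    )


def expected_clique_count(n, x):
    """Exact average of the x-clique count over all graphs on n labeled vertices."""
    pairs = list(combinations(range(n), 2))
    total = 0
    for bits in product([0, 1], repeat=len(pairs)):
        edges = {p for p, b in zip(pairs, bits) if b}
        total += count_cliques(n, edges, x)
    return Fraction(total, 2 ** len(pairs))


if __name__ == "__main__":
    n, x = 4, 3
    print(expected_clique_count(n, x), Fraction(comb(n, x), 2 ** comb(x, 2)))
```

The enumeration is exponential in $\binom{n}{2}$, so it is only meant for $n \le 5$ or so.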

We also want a large deviations inequality for the number of cliques of size $2d$ in $G$ that contain a given clique of size $d' < 2d$. Moreover, to carry out the eigenspace splitting arguments needed for Lemma 7.3, we want to know the dependence of this deviation on the number of vertices adjacent to every vertex in the $d'$-clique. The following theorem serves both these purposes.

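Definition 9.3. Given any $I \subseteq [n]$, let $A_I$ be the set of all vertices not in $I$ which are adjacent to all vertices in $I$.

Theorem 9.4. There is a universal constant $C$ so that for any $I \subseteq [n]$ of size at most $2d$, if $I$ is a clique in $G$ then for any $\varepsilon \in (0,1)$, if $n \ge 100 d^2 2^{2d} (3 - \ln \varepsilon)^2$ then

\[ P\left[ \left| \deg_G(I) - 2^{\binom{|I|}{2} - \binom{2d}{2}} \binom{n - |I|}{2d - |I|} \right| > C(3 - \ln \varepsilon)\, n^{2d - |I| - \frac{1}{2}} \right] < \varepsilon. \]

More precisely, if $|I| < d$ then

\[ P\left[ \left| \deg_G(I) - 2^{\binom{|I|}{2} - \binom{2d}{2}} \binom{n - |I|}{2d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{2d}{2}} (n - |I|)^{2d - |I| - 1}}{(2d - |I| - 1)!} \right| > C 2^{2d} (3 - \ln \varepsilon)^2 n^{2d - |I| - 1} \right] < \varepsilon. \]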

The key lemma in proving Theorem 9.2 is the following, which bounds how often subsets of $G$ of size $x$ share (potential) edges.

Lemma 9.5. If $x \ge 2$ and $n \ge x^2 q(q + 2)$ then there are at most $2\, n^{xq - q} \left( \frac{x^2}{x!} \right)^q$ multi-sets $S = \{V_1, \ldots, V_q\}$ of subsets $V_i \subseteq [n]$ of size $|V_i| = x$ such that for all $j$ there exists an $i \neq j$ such that $|V_i \cap V_j| \ge 2$.

Using Lemma 9.5, we can bound the deviation of the number of $x$-cliques in $G$ from its expected value; we carry this out now.

Definition 9.6. Define $X = \sum_{V : V \subseteq V(G),\, |V| = x} \left( 1_V - 2^{-\binom{x}{2}} \right)$ where $1_V = 1$ if $V$ is a clique in $G$ and 0 otherwise.

Proposition 9.7. $X = N_x(G) - 2^{-\binom{x}{2}} \binom{n}{x}$.

Proof. By observation. □

Corollary 9.8. If $x \ge 2$ and $n \ge x^2 q(q + 2)$ then $E[X^q] \le 2\, q! \left( \frac{x^2}{x!} n^{x-1} \right)^q$.

Proof. By Proposition 9.7, $E[X^q] = \sum_{V_1, \ldots, V_q} E\left[ \prod_{i=1}^{q} \left( 1_{V_i} - 2^{-\binom{x}{2}} \right) \right]$. Note that all terms of this sum have value less than 1. Furthermore, for all nonzero terms in this sum, for all $j$ there must be an $i \neq j$ such that $|V_i \cap V_j| \ge 2$, since the sets $V_i$ and $V_j$ must share a potential edge in order for $1_{V_i}$ and $1_{V_j}$ not to be independent. Thus, this sum is at most the number of ordered multi-sets of $q$ $x$-cliques $\{V_1, \ldots, V_q\}$ where for all $j$ there is an $i \neq j$ such that $|V_i \cap V_j| \ge 2$. In turn, this is at most $q!$ times the number of unordered multi-sets of such $q$ $x$-cliques. By Lemma 9.5, this is at most $2\, q! \left( \frac{x^2}{x!} n^{x-1} \right)^q$, as needed. □


We are now ready to prove Theorem 9.2. 


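Proof of Theorem 9.2. The result is trivial for $x = 0$ and $x = 1$ so we may assume that $x \ge 2$. Using Corollary 9.8 and Markov's inequality, for every positive even $q$ with $n \ge x^2 q(q+2)$,

\[ \varepsilon \ge P\left[ X^q > \frac{E[X^q]}{\varepsilon} \right] \ge P\left[ |X| > \left( \frac{2\, q!}{\varepsilon} \right)^{1/q} \frac{x^2}{x!} n^{x-1} \right]. \]

Thus, we just need to give an upper bound on $\min_{\text{positive even } q} \left( \frac{2\, q!}{\varepsilon} \right)^{1/q}$. For all positive even $q$, $2\, q! \le q^q$, so this expression is upper bounded by $q / \varepsilon^{1/q}$. We now try to minimize $q / \varepsilon^{1/q}$ over all positive even $q$. Taking the derivative of this expression with respect to $q$ yields $\varepsilon^{-1/q} + \frac{\ln \varepsilon}{q} \varepsilon^{-1/q}$. Setting this to 0 yields $q = -\ln \varepsilon$. However, we require $q$ to be even, so we take $q$ to be the smallest positive even integer which is greater than $-\ln \varepsilon$. Now $q \le 2 - \ln \varepsilon$ and $\varepsilon^{1/q} \ge \varepsilon^{1/(-\ln \varepsilon)} = \left( e^{\ln \varepsilon} \right)^{-1/\ln \varepsilon} = \frac{1}{e}$. Putting everything together, for this $q$, $\left( \frac{2\, q!}{\varepsilon} \right)^{1/q} \le q / \varepsilon^{1/q} \le 2e - e \ln \varepsilon$. Plugging this in gives

\[ \varepsilon \ge P\left[ |X| > \left( \frac{2\, q!}{\varepsilon} \right)^{1/q} \frac{x^2}{x!} n^{x-1} \right] \ge P\left[ |X| > e(2 - \ln \varepsilon) \frac{x^2}{x!} n^{x-1} \right]. \]

All that is left is to check that $n \ge x^2 q(q + 2)$ for this $q$ to make sure that our application of Corollary 9.8 was valid. Since $n \ge x^2 (2e - e \ln \varepsilon)(2e + 2 - e \ln \varepsilon)$, this holds, as needed. □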
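The choice of the moment order $q$ in this proof can be checked numerically. The sketch below picks $q$ as the smallest positive even integer exceeding $-\ln \varepsilon$ and evaluates the two quantities being compared; the function names are hypothetical and the script only illustrates the chain of inequalities, it does not replace the argument.

```python
import math


def choose_q(eps):
    """Smallest positive even integer strictly greater than -ln(eps)."""
    target = -math.log(eps)
    q = 2
    while q <= target:
        q += 2
    return q


def markov_factor(eps, q):
    """The factor (2 * q! / eps) ** (1 / q) produced by Markov's inequality."""
    return (2 * math.factorial(q) / eps) ** (1.0 / q)


def relaxed_factor(eps, q):
    """The upper bound q / eps ** (1 / q), valid because 2 * q! <= q ** q."""
    return q * eps ** (-1.0 / q)


def target_factor(eps):
    """The factor e * (2 - ln eps) appearing in the theorem statement."""
    return math.e * (2 - math.log(eps))


if __name__ == "__main__":
    for eps in (0.5, 0.1, 1e-3, 1e-6):
        q = choose_q(eps)
        print(eps, q, markov_factor(eps, q), relaxed_factor(eps, q), target_factor(eps))
```

For each tested $\varepsilon$ the three printed factors should appear in nondecreasing order, matching $\left( 2\, q! / \varepsilon \right)^{1/q} \le q / \varepsilon^{1/q} \le e(2 - \ln \varepsilon)$.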

Now that we have proven Theorem 9.2, we will derive Theorem 9.4 from Theorem 9.2. The idea is that conditioned on $I$ being a clique, by Theorem 9.2, $\deg_G(I)$ is primarily determined by $|A_I|$, which can be easily shown to be tightly concentrated around its expected value. We start with the following lemma.

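Lemma 9.9. If $n > d$ then for any $I \subseteq [n]$ of size less than $d$, if we first determine all of the edges incident to elements of $I$ (which determines $A_I$) then if $I$ is a clique, when we look at the remainder of the graph, for any $\varepsilon_1 \in (0,1)$,

\[ P\left( \left| \deg_G(I) - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \right| > 10(2 - \ln \varepsilon_1)\, n^{d - |I| - 1} + (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2 \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!} \right) < \varepsilon_1 \]

so long as the following conditions hold:

1. $(d - |I|) \left| \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right| \le 1$

2. $|A_I| \ge d^2 (2e - e \ln \varepsilon_1)(2e + 2 - e \ln \varepsilon_1)$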


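To prove Lemma 9.9 we require the following results; proofs of the more elementary ones are deferred to Section 9.1.

Proposition 9.10. For all nonnegative integers $n$ and $k$ where $k \le n$, $0 \le \frac{n^k}{k!} - \binom{n}{k} \le \frac{k^2 n^{k-1}}{2 \cdot k!}$.

Proof of Proposition 9.10. Note that $n^k \ge \prod_{j=0}^{k-1} (n - j) = n^k \prod_{j=0}^{k-1} \left( 1 - \frac{j}{n} \right) \ge n^k \left( 1 - \sum_{j=0}^{k-1} \frac{j}{n} \right) \ge n^k \left( 1 - \frac{k^2}{2n} \right)$. This implies that $0 \le n^k - \prod_{j=0}^{k-1} (n - j) \le \frac{k^2}{2} n^{k-1}$ and dividing everything by $k!$ gives the claimed result. □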
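As a quick sanity check on Proposition 9.10, the sketch below compares $n^k/k!$ with $\binom{n}{k}$ in exact rational arithmetic. The upper bound is taken in the form $k^2 n^{k-1} / (2 \cdot k!)$, which is what the proof's estimate $n^k (1 - k^2/(2n))$ gives; the function names and the tested range are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def gap(n, k):
    """n^k / k! minus the binomial coefficient C(n, k), as an exact fraction."""
    return Fraction(n ** k, factorial(k)) - comb(n, k)


def upper(n, k):
    """The bound k^2 * n^(k-1) / (2 * k!), for k >= 1."""
    return Fraction(k * k * n ** (k - 1), 2 * factorial(k))


def holds(n, k):
    """True when 0 <= gap <= upper for this pair (n, k)."""
    return 0 <= gap(n, k) <= upper(n, k)


if __name__ == "__main__":
    print(all(holds(n, k) for n in range(1, 30) for k in range(1, n + 1)))
```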


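Lemma 9.11. If $n > d > |I|$ then

\[ 2^{\binom{|I|}{2} - \binom{d}{2}} \frac{(n - |I|)^{d - |I|}}{(d - |I|)!} - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} \le 2^{\binom{|I|}{2} - \binom{d}{2}}\, n^{d - |I| - 1}. \]

Proof of Lemma 9.11. Applying Proposition 9.10 on $n - |I|$ and $d - |I|$ gives

\[ \frac{(n - |I|)^{d - |I|}}{(d - |I|)!} - \binom{n - |I|}{d - |I|} \le \frac{(d - |I|)^2 (n - |I|)^{d - |I| - 1}}{2 (d - |I|)!} \le (n - |I|)^{d - |I| - 1} \le n^{d - |I| - 1}. \]

Multiplying this equation by $2^{\binom{|I|}{2} - \binom{d}{2}}$ gives the claimed result. □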

Proposition 9.12. For all nonnegative integers $x$ and $d$ where $x \le d$, $x(d - x) + \binom{x}{2} - \binom{d}{2} = -\binom{d - x}{2}$.

Proposition 9.13. For any nonnegative integer $k$ and any $x$ such that $|kx| \le 1$, $|(1 + x)^k - (1 + kx)| \le k^2 x^2$.

Eventually in the course of proving Lemma 9.9 we will break

\[ \deg_G(I) - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \]

into pieces. The following lemmas offer the necessary bounds on each piece.
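Propositions 9.12 and 9.13 are elementary and can be checked mechanically on a grid of values. The sketch below does so in exact arithmetic; the helper names and the tested ranges are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from math import comb


def prop_9_12_holds(x, d):
    """Check x(d - x) + C(x, 2) - C(d, 2) == -C(d - x, 2) for integers 0 <= x <= d."""
    return x * (d - x) + comb(x, 2) - comb(d, 2) == -comb(d - x, 2)


def prop_9_13_holds(k, x):
    """Check |(1 + x)^k - (1 + k x)| <= k^2 x^2 for a rational x with |k x| <= 1."""
    return abs((1 + x) ** k - (1 + k * x)) <= k * k * x * x


def grid(k, steps=20):
    """Rational values of x covering the interval where |k x| <= 1."""
    scale = steps * max(k, 1)
    return [Fraction(j, scale) for j in range(-steps, steps + 1)]


if __name__ == "__main__":
    ok_12 = all(prop_9_12_holds(x, d) for d in range(15) for x in range(d + 1))
    ok_13 = all(prop_9_13_holds(k, x) for k in range(8) for x in grid(k))
    print(ok_12, ok_13)
```

Proposition 9.12 is what converts $2^{-\binom{d-|I|}{2}}$ into $2^{|I|(d-|I|) + \binom{|I|}{2} - \binom{d}{2}}$, and Proposition 9.13 is the first-order expansion of $(1+x)^k$ used with $k = d - |I|$.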


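Lemma 9.14. If $(d - |I|) \left| \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right| \le 1$ then

\[ \left| \frac{2^{-\binom{d - |I|}{2}} |A_I|^{d - |I|}}{(d - |I|)!} - \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \right| \le (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2 \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!}. \]

Proof of Lemma 9.14. Applying Proposition 9.13 with $x = \frac{2^{|I|} |A_I|}{n - |I|} - 1$ and $k = d - |I|$, since $(d - |I|) \left| \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right| \le 1$,

\[ \left| \frac{\left( 2^{|I|} |A_I| \right)^{d - |I|}}{(n - |I|)^{d - |I|}} - 1 - (d - |I|) \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right) \right| \le (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2. \]

Multiplying this equation by $\frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!}$ and using Proposition 9.12 with $x = |I|$ gives

\[ \left| \frac{2^{-\binom{d - |I|}{2}} |A_I|^{d - |I|}}{(d - |I|)!} - \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!} - \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right) \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I| - 1)!} \right| \le (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2 \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!}. \]

Since $\left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right) 2^{\binom{|I|}{2}} (n - |I|) = \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) 2^{\binom{|I|+1}{2}}$, the third term on the left-hand side equals $\left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!}$, which is the claimed result. □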


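Lemma 9.15. If $|A_I| \ge d - |I|$ then

\[ 2^{-\binom{d - |I|}{2}} \frac{|A_I|^{d - |I|}}{(d - |I|)!} - 2^{-\binom{d - |I|}{2}} \binom{|A_I|}{d - |I|} \le 2^{-\binom{d - |I|}{2}} |A_I|^{d - |I| - 1}. \]

Proof of Lemma 9.15. Applying Proposition 9.10 on $|A_I|$ and $d - |I|$ gives

\[ \frac{|A_I|^{d - |I|}}{(d - |I|)!} - \binom{|A_I|}{d - |I|} \le \frac{(d - |I|)^2 |A_I|^{d - |I| - 1}}{2 (d - |I|)!} \le |A_I|^{d - |I| - 1}. \]

Multiplying this equation by $2^{-\binom{d - |I|}{2}}$ gives the claimed result. □

Lemma 9.16. For all $\varepsilon_1 \in (0, 1)$, if $|A_I| \ge d^2 (2e - e \ln \varepsilon_1)(2e + 2 - e \ln \varepsilon_1)$ then

\[ P\left[ \left| \deg_G(I) - 2^{-\binom{d - |I|}{2}} \binom{|A_I|}{d - |I|} \right| > e(2 - \ln \varepsilon_1) \frac{(d - |I|)^2}{(d - |I|)!} |A_I|^{d - |I| - 1} \right] < \varepsilon_1. \]

Proof of Lemma 9.16. This lemma follows immediately from applying Theorem 9.2 on the random graph $G$ restricted to the vertices $A_I$. □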


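Proof of Lemma 9.9. We break

\[ \deg_G(I) - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \]

into four parts and analyze each one separately.

1. $2^{\binom{|I|}{2} - \binom{d}{2}} \frac{(n - |I|)^{d - |I|}}{(d - |I|)!} - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|}$

2. $2^{-\binom{d - |I|}{2}} \frac{|A_I|^{d - |I|}}{(d - |I|)!} - 2^{\binom{|I|}{2} - \binom{d}{2}} \frac{(n - |I|)^{d - |I|}}{(d - |I|)!} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!}$

3. $2^{-\binom{d - |I|}{2}} \binom{|A_I|}{d - |I|} - 2^{-\binom{d - |I|}{2}} \frac{|A_I|^{d - |I|}}{(d - |I|)!}$

4. $\deg_G(I) - 2^{-\binom{d - |I|}{2}} \binom{|A_I|}{d - |I|}$

Combining Lemma 9.11, Lemma 9.14, Lemma 9.15, and Lemma 9.16, we have that under the given conditions,

\[ P\left( \left| \deg_G(I) - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \right| > 2^{\binom{|I|}{2} - \binom{d}{2}} n^{d - |I| - 1} + (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2 \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!} + 2^{-\binom{d - |I|}{2}} |A_I|^{d - |I| - 1} + e(2 - \ln \varepsilon_1) \frac{(d - |I|)^2}{(d - |I|)!} |A_I|^{d - |I| - 1} \right) < \varepsilon_1. \]

The result now reduces to showing the following inequality:

\[ 2^{\binom{|I|}{2} - \binom{d}{2}} n^{d - |I| - 1} + 2^{-\binom{d - |I|}{2}} |A_I|^{d - |I| - 1} + e(2 - \ln \varepsilon_1) \frac{(d - |I|)^2}{(d - |I|)!} |A_I|^{d - |I| - 1} \le 10(2 - \ln \varepsilon_1)\, n^{d - |I| - 1}, \]

which follows from the facts that $|I| < d$, $|A_I| \le n$, and $\frac{(d - |I|)^2}{(d - |I|)!} \le 2$. □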
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To use Lemma 9.9 to prove Theorem 9.4, we need probabilistic bounds on |Aj|. 

Lemma 9.17. There is a universal constant $C$ so that for all $\varepsilon_2 \in (0,1)$,

\[ P\left[ \left| |A_I| - 2^{-|I|} (n - |I|) \right| > C(2 - \ln \varepsilon_2) \sqrt{n} \right] < \varepsilon_2. \]

Proof. The lemma follows from standard concentration of measure. If we let $X_i$ be 1 if $i \notin I$ and $i$ is adjacent to all vertices in $I$ and 0 otherwise, then $\sum_{i=1}^{n} X_i = |A_I|$. The expected value of $|A_I| = \sum_{i=1}^{n} X_i$ is $2^{-|I|} (n - |I|)$, so by Bernstein's inequality there is $C$ so that for all $\varepsilon_2 \in (0,1)$, $P\left[ \left| |A_I| - 2^{-|I|} (n - |I|) \right| > C(2 - \ln \varepsilon_2) \sqrt{n} \right] < \varepsilon_2$. □
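The expectation used in this proof can be confirmed by brute force for small parameters. The sketch below enumerates every configuration of the edges between $I$ and the remaining vertices and averages the number of common neighbours; the function name and the tiny parameters are assumptions made for illustration, and the script says nothing about the concentration itself.

```python
from fractions import Fraction
from itertools import product


def expected_common_neighbours(n, size_i):
    """Exact expectation of |A_I| over the edges between I and the other vertices.

    Each of the (n - size_i) outside vertices has size_i potential edges into I,
    each present independently with probability 1/2.
    """
    outside = n - size_i
    slots = outside * size_i
    total = 0
    for bits in product([0, 1], repeat=slots):
        for v in range(outside):
            block = bits[v * size_i:(v + 1) * size_i]
            if all(block):
                total += 1
    return Fraction(total, 2 ** slots)


if __name__ == "__main__":
    n, size_i = 5, 2
    print(expected_common_neighbours(n, size_i), Fraction(n - size_i, 2 ** size_i))
```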

We have all we need now to prove Theorem 9.4. 


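Proof of Theorem 9.4. The result is trivial if $|I| = d$ and follows immediately from Theorem 9.2 if $|I| = 0$, so we may assume that $0 < |I| < d$.

In what follows, $C$ and $C'$ denote universal constants which may vary from line to line. Now recall that by Lemma 9.9, for any $\varepsilon_1 \in (0,1)$,

\[ P\left( \left| \deg_G(I) - 2^{\binom{|I|}{2} - \binom{d}{2}} \binom{n - |I|}{d - |I|} - \left( |A_I| - \frac{n - |I|}{2^{|I|}} \right) \frac{2^{\binom{|I|+1}{2} - \binom{d}{2}} (n - |I|)^{d - |I| - 1}}{(d - |I| - 1)!} \right| > C(2 - \ln \varepsilon_1)\, n^{d - |I| - 1} + (d - |I|)^2 \left( \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right)^2 \frac{2^{\binom{|I|}{2} - \binom{d}{2}} (n - |I|)^{d - |I|}}{(d - |I|)!} \right) < \varepsilon_1 \]

so long as the following conditions hold:

1. $(d - |I|) \left| \frac{2^{|I|} |A_I|}{n - |I|} - 1 \right| \le 1$

2. $|A_I| \ge d^2 (2e - e \ln \varepsilon_1)(2e + 2 - e \ln \varepsilon_1)$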

Taking £\ =£2 = 5 , plugging Lemma 9.17 into these equations and using the union bound, we have 
that 


p ( 


deg G (I)-2®-® 


d\ln - |i| 

d~\I\ 


- |Aj|- 


n - |J|\ 2(^)-©(n - UiyHfl - 1 


2ih 


(d - |J| - 1)! 


> 


C(3 - h + „ - ( ^-.^>5 


< £ 


so long as the corresponding conditions hold. Assuming these conditions hold for now, since 

HI < HI < d, 


(d-W 
(d - HI) 


M < 2, and 21 ^ 2 ( 2 ) = 2 ( III 2 +1 ), 


2 / 2^C(3 - In £•) y/n \ 2 2@ (t)(n-\I\) d 1/1 < c , 2 |J| (3-ln f) 2 n ^ _ ^ d -\i\-i 


n - HI 


(d - HI)! (n - HI) 

< C-2 d (3-ln£)V Hi,_1 


Plugging this in we have that 


p ( 


deg G (I)-2 ©-© 



n - HI \ 2( |I| 2 +1 ) ®(n - \I\) d W 1 
2 lh I (d - HI - 1)! 
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C2 rf (3- lne) 2 jz d_|i|_1 ) < £ 


as needed. For the first part of Theorem 9.4, note that this implies that

P(deg G (/)-2©-©(^: 


> 


|Aj|- 


2 ih 


(d- m - 1)! 


+ C2 d (3-ln£)V" |Ihl ) < £ 


Plugging in Lemma 9.17 and noting that 

- 2 ( m 2 1 )—(2) (n - |j|)d-| 2 f-i 


C(3 - in£) ryi < C '(3 - 1 n£)H rf - |:| -5 


we have that 


P( deg G (I) - 2(z)"©|” _ jjjj > C(3 - ln£)n d “ |Ih 2 + C'2 rf (3 - In £)V“ |J1_1 ) < £ 

Taking n ^ Cd 2 2 2d (3 - ln£) 2 and d '^2, 

C2 d (3 - In £)V _|i|_1 < C'(3 - ln£)n d “ |:| “5 


Plugging this in gives that 



> C(3 - ln£)/z rf ^ 2 ) < £ 


as needed. All that is left is to check the conditions for Lemma 9.9, which are as follows. 

1. (d-ifi) 2 "'*L;i ,) ^ 1 

2. 2~^(n - |f|) > d 2 e(3 - In£’)(3e + 2 - eln£) + e(3 - In£) yfn 

These conditions are true if n ^ 4d 2 2 2d (3 - In e) 2 . To see this, note that since d > |/| > 0 and |/| < ^ 

1. (d - \I\)2^e(3 - In c) y/n ^ d2^e(3 - In £) yfn - |f| ^ | y/4d 2 2 2d (3 - In e) 2 yfn - |f| ^ n - |f| 

2. 2^d 2 e(3 - In £’)(3e + 2 - elnc) < 2^d 2 e 2 H^(3 - ln £) 2 < 10 • 2^d 2 (3 - ln £) 2 ^ f^n 

3. 2^e(3 - In £) V« < 4 • 2i : '(3 - In e) (2d2 d (3 - In £)) ^ \ 

Dividing the first statement by n - 1/| gives the first condition. Using the second and third statements, 

2^d 2 e(3 - In £)(3e + 2 - e In £) + 2^e(3 - In £) yjn < —n <n-\I\ 

16 

Dividing this by 2^ gives the second condition, as needed. □ 
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9.1 Proofs of Remaining Lemmas 

Proof of Lemma 9.5. 

Definition 9.18. For each multi-set $S = \{V_1, \ldots, V_k\}$ of $k$ $x$-cliques, define the constraint graph $H_S$ as follows.

1. $V(H_S) = \{V_1, \ldots, V_k\}$

2. $E(H_S) = \{(V_i, V_j) : |V_i \cap V_j| \ge 2\}$

Let's first bound the number of $S$ such that $H_S$ is connected.

Lemma 9.19. For any $x, k \ge 2$, there are at most $n^{kx - 2k + 2}\, k!\, \frac{2}{x^4} \left( \frac{x^4}{2 (x!)} \right)^k$ multi-sets $S$ of $k$ $x$-cliques such that $H_S$ is connected.

Proof. Since $H_S$ is connected, we can order $\{V_1, \ldots, V_k\}$ so that for all $j > 1$ there is an $i < j$ such that $(V_i, V_j) \in E(H_S)$. Assuming this is the case, we have at most $\binom{n}{x}$ choices for $V_1$. For each $j > 1$, there are two vertices in $V_j$ which are contained in some $V_i$ where $i < j$. There are at most $j$ choices for this $i$ and then there are $\binom{x}{2}$ choices for which two vertices of $V_i$ are contained in $V_j$. There are at most $\binom{n}{x-2}$ choices for the other $x - 2$ vertices of $V_j$, so for each $j > 1$ there are at most $j\, \frac{x^4 n^{x-2}}{2 (x!)}$ choices for $V_j$. Putting everything together, there are at most

\[ \binom{n}{x} \prod_{j=2}^{k} \left( j\, \frac{x^4 n^{x-2}}{2 (x!)} \right) \le n^{kx - 2k + 2}\, k!\, \frac{2}{x^4} \left( \frac{x^4}{2 (x!)} \right)^k \]

multi-sets $S$ of $k$ $x$-cliques such that $H_S$ is connected. □


Now consider the number of multi-sets S of q x-cliques such that Hg has t connected components 
with sizes si, ■ ■ • ,Sf. Using Lemma 9.19, there are at most 


ii(— 2 4 

i=i v x 




such S. 

We now total this up over all possible f,Si, ■ • • ,s*. For the special case that all connected 

( 2 \Cj 

such S. We will show that this term 

contributes more than all of the other terms combined, which implies that the total number of S is 

( 2 

|jj , as needed. 

For a given t, Y\\=\ S;! ^ 2 f ~fq + 2- 21)\ ^ 2 f (q + 2)t~ 2t . Also, each of the t components of H$ 
must have at least two vertices so to determine the sizes S\, ■ • ■ , S/ it is sufficient to decide how to 
distribute the q - 2t extra vertices among the t connected components of Hg. There are at most 
C*- 1 ) ^ q t] 11 ways to do this. Thus, the total contribution for terms of a given t is at most 


2 t (q(q + 2)r 2t n x ^ +2t 





Since n ^ x 2 q(q + 2), the n xq q 
other terms combined, as needed. 


term which comes from t — | contributes more than all the 

□

Proof of Proposition 9.12. Rearranging this equation gives $\binom{d}{2} = \binom{x}{2} + x(d - x) + \binom{d - x}{2}$, which just says that if we want to pick two elements in $[1, d]$ we can either pick two elements in $[1, x]$, one element from $[1, x]$ and one element from $[x + 1, d]$, or two elements from $[x + 1, d]$. □


Proof of Proposition 9.13. 


\(l+x) k 



□ 


10 Optimality of MPW Analysis 

In this section we sketch an argument due to Kelner showing that the MPW moments are not PSD when $\omega \gg n^{1/(d+1)}$.

Theorem 10.1. With high probability, the MPW moments are not PSD when $\omega \gg n^{1/(d+1)}$. In particular, if $\omega \gg n^{1/(d+1)}$ then for all $s$, for some appropriately chosen $C$, with high probability,

\[ E\left[ \left( C \omega^d x_s - \sum_{I : I \subseteq V,\, |I| = d} \prod_{i \in I} r_s(i)\, x_I \right)^2 \right] < 0. \]

Proof Sketch. We will be using the following proposition heavily.

Proposition 10.2. For all $I \subseteq V(G)$ such that $|I| < 2d$, $\sum_{j \notin I} E[x_{I \cup j}] = (\omega - |I|) E[x_I]$.

Proof. We have the equation $\sum_j x_j = \omega$, so

\[ \sum_j E[x_I x_j] = |I| E[x_I] + \sum_{j \notin I} E[x_{I \cup j}] = \omega E[x_I] \]

and the result follows. □

Corollary 10.3. For all $I \subseteq V(G)$ and all $m$ such that $|I| + m \le 2d$,

\[ E\left[ \sum_{J : I \subseteq J,\, |J| = |I| + m} x_J \right] = \frac{\prod_{x=0}^{m-1} (\omega - |I| - x)}{m!} E[x_I] = \binom{\omega - |I|}{m} E[x_I]. \]

Proof. This result follows from repeatedly expanding out $(\omega - |K|) E[x_K] = \sum_{k \notin K} E[x_{K \cup k}]$. For any given $J$ such that $I \subseteq J$ and $|J| = |I| + m$ there are $m!$ different ways to reach $J$ from $I$, which gives us the $m!$ term. □
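Corollary 10.3 only uses the linear constraint $\sum_j x_j = \omega$, so it also holds for genuine expectations over an actual clique of size $\omega$. The sketch below checks the identity in that simpler setting, with the clique drawn uniformly at random among all $\omega$-subsets. This is an assumption made purely for illustration: it replaces the pseudo-expectation of the paper by a true distribution, and the function names and parameters are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def moment(n, omega, subset):
    """E[x_subset] when the clique is a uniformly random omega-subset of range(n)."""
    cliques = list(combinations(range(n), omega))
    hits = sum(1 for c in cliques if set(subset) <= set(c))
    return Fraction(hits, len(cliques))


def extension_sum(n, omega, subset, m):
    """Sum of E[x_J] over all J containing subset with |J| = |subset| + m."""
    rest = [v for v in range(n) if v not in subset]
    return sum(
        moment(n, omega, tuple(subset) + extra)
        for extra in combinations(rest, m)
    )


if __name__ == "__main__":
    n, omega, subset = 6, 4, (0, 1)
    for m in range(4):
        lhs = extension_sum(n, omega, subset, m)
        rhs = comb(omega - len(subset), m) * moment(n, omega, subset)
        print(m, lhs, rhs)
```

For each $m$ the two printed values agree, including the case $m = 3$ where both sides vanish because no set of size 5 fits inside a clique of size 4.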
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With this corollary in hand, we expand out E[(Cco d x s - Y hev\m, fife/ r s (z)x,) 2 ], obtaining 


HCV\(s), 
HI =d 


m=i/i=d 


E[C 2 co 2d x s ] - 2Cco d E[ ^iu{s)] + E[ ^ r s (0 

Using Corollary 10.3, 


ielAJ 


Xluj] 


2Cco d E[ £ x IU[s) ]=2Cco d r d 1 )E[x s 

MGV\|s}, ' 

HN 


Thus combining the first two terms gives 


C 2 co 2d - 2Ca> d l M d nlEfe 


From our concnetration bounds on degc{s), with high probability E(x s ) is ^(1 + 0(—4^)). Thus, 
taking C to be a sufficiently small constant (which will depend on d), with high probability the first 

/ C \ 

two terms are ) 

Analyzing the third term is trickier. For the third term, taking K = IAJ for each term. 


* L n A(0 U/uj] = l l ru® 


\ieIAJ 

mn=i 


x=0 IMI£V\W, [ielAJ 


E[x/u/] 


mi\=d,\IAJ\=2x 

d 

L L 

X=0 K-KQV \{s), 
\K\=2x 


co-d-x 
d - x 


Hu (0 

V ieK 


E[x k ] 


We will now analyze the expected value and variance of this expression. However, before doing so there is a subtle issue we must deal with. $E[x_K]$ is not completely independent of $\prod_{i \in K} r_s(i)$. What saves us is that the dependence is small enough to be negligible.

For each K c V(G) \ {s}, define yx to be the expected value of E[xk] if we preserve all of the 
edges of G which are not incident with s but reselect the edges of G incident with s randomly. From 
our concentration results on degQ(K), with high probability, for all K, |£[x/<] - y/<| is 0( w J^+^ ). We 
now write 


L L 

X=0 K:KQV\[s], 
]K\=2x 


co-d-x 
d - x 

u 

L L 

X=0 K:KQV\[s I, 
\K\=2x 

d 

L L 

X=0 K:KcV\{s}, 
\K\=2x 


|^[ g (0 


JeK 


E[x k ] 


co-d-x 


l 


d - x 

co-d-x 
d - x 


Vk 


n 

f ieK 

( 

n 

V ieK ) 


(E[x k \ - y K ) 
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For the second part, for each x E [0, d] there are at most ( 2 ",) K such that K C V(G) \ {s} and \K\ = 2x. 
Thus, 


E E 

X=0 K:KCV\( S |, 
\K\=2x 


co - d - x 


\ ~[ ''.<(/) (E[x k ]-xjk) 


n\(co-d-x\ ,~ r , 

max £[**]-yjd 
2x \ d-X K:\K\=2x 


From our concentration bounds, with high probability, for all x the corresponding term on the right 

^ jCi i x ^ d~\~x 2d 1 1 

is 0(^^) which is O(^). For all x ^ d, this is much smaller than Q( ^—). Thus we may ignore 
the second part. 

For the first part, the values r s (i ) are completely independent of the values i/k- When we take 
the expected value of 

i e (“j.'rlM* 

X=0 K:KQV\{s), 

\K\=2x 


1> S (0 


over the edges incident to s, only the x = 0 term remains and we obtain (°’ d ). When we take the 
variance of 


E E 

X =0 K:KQV\{s), 
\K\=2x 


CO - d ~ X 


11^(0 


over the edges incident to s, only the square terms remain so we obtain 


E E (El-dV 

X=0 X:KCV\| S ), \ / 

|jq=2* 

From our concentration bounds on the with high probability this is 


co - d 


co-d-x' In / co 4 


d-x \2x \n 4x 


which is 


Thus, with high probability 


co - d 


E E EEr)(EML 

|jq=2x 

is C’ d d ) (l ± O(^)) which is O(co d ). Putting everything together, if co » na+c then for some 
appropriately chosen C, with high probabbility, E[(Cco d x s - Jli:icv,\i\=d Vhei n s (z)x,) 2 ] <0 □ 


[]/s(o vk 
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